1. License


This book is published under the Creative Commons license CC-BY-SA. Under this license, you may:


• Share it with anyone.

• Revise it and publish the result.

• Use it commercially.


• However, you must credit the original book, its author, and www.kaniyam.com; you must pass these same rights on to everyone; and you must publish under the Creative Commons license.


Book source:


http://static.kaniyam.com/ebooks/learn-machine-learning-in-tamil.odt


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
2. Author's Note


<3/ <S5)<o57 su65)ijiLj ld GurTGV <oTsmd(^LD e^Q^rs^eh Geu&n<svuurrrfd^LD ^i_£f>§?l&v usvGgutq] 
iShjd3 : 6S)657376T7 67Lp/5 <3]6Vi6ii6V37dSl6V (3/565>6iid(3 ) 7d <3]^I37llt7657 GrsijLb 

^GVGvrr3,§], LDmsvxsv Gprjfhi&sifhsv £Li_/ouu5/odld 3^.i_id Q&gvg u&j 676577Qi usvG)surry 

LiarTrjam (oTgstlE&j 65>6iid37LjuLdi_657. ^/5/D(/f) (LpsirLi Qgussxsv ur7rjd>/5 r£lrr)]6)J657td37677l6V 
3r/5[5(5llJLDT737 £g)0/f>0/ (361165)60 UmjfB&Jlj ULpShlJ <o7W<950, §(l LlQ [765717)] 6£(77) 37/5 [5 §3 TJ id 

£g) Gvsvrrg, fil/T)] 611657/5(51 go Ggu&dgv umjd&d, Q/5r7i_Thj!dluj/5r76V ^ull iSlrj^&sxsmQuj 
£g)0/. £g)00OT) <95 UJ U655fldim ) Lp6V <OT<S37<S0 3 : /d^]ld 6pd,3)]U (3 U 773,6)516065)60. 67657(3611, 
&L-GU_65T!r)] QQi) tprrsn G6 i]6S)6V6$)UJ 6)S)l-($)6)S}l-Qi_65t. " [5LCid(/5/5/5r7657 10 <si/ 0 z_z_o 
'Software Testing 1 3,i65)7nu56b <3i65)iu6iiid ^srrsnQa," 6T6 qjl£> rff65)6$r uiSlsi), l/ 0 / 

(361165)60 Sl65)l_d(3 ) Ld (Lp657(3u, ^(Vjd(d)Lb (36U65)6V65)UJ 6)51 6)51 L_3z_<o57. <^6577760 

6765765)165) L UJ <3] [5/5 10 61/0i_ <3165) I LI 611 (3 LD 6765ld(^ l£ 137U QuflUJ l5l 7jd3 : 6S)657UJ773, 

^9/<o5)ld/50/(Si5]l!_l_0/. <ot/ 670 G>ff>/7& n655T<svid(3,d Q&esrjnnGViLL, (^eiieiienei] < 31 ( 5/37 < 31 eoiuenid 
QarremL- pup 677d3,^n,d(^, Testing-<9500 (3/565>6iiu56b65)60 <o7w/ry 3^/51 
<3165)1 LI l516)5 Ld 1^6575. edtldlS/GL) £g)0/5(3>0 /_/7p<S<S/j/_//_/70 <o7W<S0, (361165) 60 u560603 LD 60 
6)5l1l)-6V £g)0L7Z_/0/ 0/D/ry 37Lj-657LD77657 3777603,ldl _ LL 773,(3611 $jQ])[53)3}]• 67Thl(3) 0136573)77601id 

Qf5m_r}d$5hLirTg; [ffp33ifld3ij uid(3i_657. Testing Q&iuuj (3)65)/o[5/5 < 3 ]^]U 6 uid 
03T76557l_611lj3(36f7 (3uT73)lld 67657/07703677. 67657(3611 Testing 676573)] 033657657360, 

6765165)165)1 UJ 69165)IU6113,Si3 )0 (361165)60 Si65) Z_ LJ U3J 37q.657ld 676577)] (3 /536S7IT) (36U, (36113)1 

0 165)703677160 6T65765)165)l _ UJ ^1/0657365)677 6H677[j^^]d 033677677 (LpiS/Glf 03lLl(3/5657. 


(°£j3}]6)J65)[7 0/53l_[jd(FlUJ33 [53657 3/03)]d0336557(3l_ 6 U[F >/5 GNU/LinUX, MySgl, 

html, css, javascript. Python (3 u 3657/0 65 > 6 u 67657 d 0 65)3033(5)33 0 
G)/5r7i_ihid>i657. ^65)6U3^Q67777(5) Bigdata, ELK, hadoop, pig, hive, 

spark (3U77657706)1 TO65)TOIL!id 37/037/5 0/53l_thlSl(3657657. <31/5/0377737/5 0 65fllU7737 CT 00 <S £0 
COXLTSe-Lb G&ijGflGVGin&v. Course rBi—^^jusurf^sh^^JD^rrm uuSlrodl^ Q&jrrsft&ujrrg, 
lUlTSSXSST sfihsm&V, (3)^lGS)IJ 6)S)6$)6V (5.95 L_Z_/7/j<95 6)7. <o7W(3(Sl/ S^lGl^SV £g)0/5<gL/ L^QlU 33)33, 
Q^/7i_/E/S)(3(o37<o57. g,(f)(D6ii(D6$)(D Gl3,m_!jr53 1 i uuSIjd d) Q&iugtfLb, < 3 ]g$) eu ujdi^I 3&ssfhuLD 
Lfil&ST&sfljBLflsv <oT(±p§>)ilild, 33 GI sm3<srfl3<sh g-Qjjsundd) YouTube si) GhsusifluSliGQid) 
su^Q^m. < 3 i&)G$t iSismsmrj Bi.cfda.ta. Engineer, Hadoop admin Qurrmro 

u3)G)S)&3,rrG5T GpijgirTsmsvgisinm ^T^njQs,rrsnsn^ Q^m^thjQQssrssr. 


$)$<sv ^^fhuLD ^TSSTSsrGlsussTfDrTsb, rF,rrssr urh]@ GIujdjd rffQjsusmrhJ3&rrlsiT 

Ginrja>nsm&S)eviid <oTsstssisst^ 0<5/rsiy Q&uj&j G)ShGi_mj36rT. ^€51 sv s^sirfDrrmTCS -sv prrssr 
Gsuss)svd(3,d G&ijrB&jGiSlL-Gi—GST. ^^ssr iShsirsmij suLpdaabGurrsv srssrd^^ Glfsiflrsa, 

bigdata, hadoop GurrssifDGurnssirD still® stiii-G®, 'Machine learning 1 sr^jm 

l]§>iuj gjssirouSlsv Gsu&r>svGl3ujuj3 Q&msir&sTrTrjgim. rnmsiniLb ^ijsuQ-pi—sm usv lj^iuj 

6)S)6)$uj(h]3,6$)GfT& 3/nrr)jd Gl3rrsm(G) &rg,r5$£i!jLDrTg; Gsu&vsv Q&iuiug, Q^m^iddiGmssr. 
^l&jjGusmrj prrm 3jD^jdGl33smL_ sS)s)$ujih]36$)SfT smsud,^ /_)/_/<5 <5 <95 <5 <o<n <5 

<oT(Lg^ULi<shG<5YTsir. ^/oGun^i Deep leaming-®£>zj uin/fid a/Brrijd Qarrsm® 

^(tyddlG/osir. 


^j^smrrsv rsmsir Gl3irsvsv suq^su^j STSSTSsiGlsussironsv, " 3foGfD3(r^d(^ G)3svsvilG 
l _ G ldsvsv mb Hd/DULi" :) 


Du. Nithya


East Tambaram

26 April 2019


Email: nithyadurai87@gmail.com


Blog: http://nithyashrinivasan.wordpress.com
3. Examples


All the code examples in this book are available at


http://github.com/nithyadurai87/machine_learning_examples
4. Before You Begin


Ever since I began writing articles about Bigdata and Machine learning on Kaniyam.com, many people who want to join this field have been raising all kinds of questions. Among them: "If I want to join this field, what exactly should I learn? Where should I learn it? How long does it take to finish a few courses? Can someone new to computing enter it directly?" There is no single direct answer to all of these. The answer, though, is that anyone can enter this field. It is welcome that many people want to join fields such as bigdata, data science, machine learning, deep learning, and AI. But it takes a good deal of learning and time. You can learn through steady practice and by joining and contributing to the various groups in these areas.


For example, if we want to fly an aeroplane, we cannot go straight to a training centre, take a course, and fly. We first need to get used to riding a bicycle, a two-wheeler, and a four-wheeler, and build up basic knowledge of machines. Then we need to learn about subjects such as physics, astronomy, and aerodynamics. Only after that are we ready to learn to fly. Wanting to join these fields is much like flying an aeroplane. There are several prerequisites for those who want to enter them. They are as follows:
1. GNU/Linux:


( Lp^stilsv stilmdsiv s,fDQjdGl3,rrsnsn Qsusm(^)Lb. ^gjGsu utDUU3)tD(3j ^L^uusm _ 

rm&,6$)i6$)i—UJ Command line, vim Qunssiposupptil^ lS^j ^rhi3i(^d(^ stilq^uuid 
^roui—Qsusmi^ud. ^aGsu wind0WS-g> (LptxpGugjLDna LD/DrBgjGtikd® stilssidsrtilsb 
surripd, GlprTi—fhi^rhi&Grr. WindowS-g> esxsu&gid QarT<oSsr(j)l ^empQiuebeomi) arorryd 
Q&mshei]<s$){B stilt— ssaffUiSJesr anGilev <^lLi—3h£ gientpeugn st stilus. 

^,<s(3(51/ ^)/jL/<5^<95<5^)(5l5]0/5^y 61/ fhJ(&)lh] <95SIT 

http://www.linuxtraining.co.uk/download/new_linux_course_modules.pdf

http://freetamilebooks.com/ebooks/learn-gnulinux-in-tamil-part1/

http://freetamilebooks.com/ebooks/learn-gnulinux-in-tamil-part2/


2. Networking basics: 

Get to know a little about the fundamentals of networking, such as IP addresses, Routing, Firewall, DNS, and VPN.

http://www.linuxhomenetworking.com/ 



3. Programming: 


rSluevrrdaLb QtBtfirs&jj Garrmsfr (Seuemu^uj^j ^euSUuid. " r§lrj(sonds,id <oTS5id(&) Gurjrrg]" 
<oJSS)iLb srsmsmid £g)0/5<g/7<sii>, Python- sSlq^rB^j ^jsurhi(^rhi3nsrr. /j5)<9561//_o <oTsrihu 
QwrTLfl. 2 _thJ 3 ,sn uuj£§)Ig$)G$t 3 r<svuLDrT 3 i fiddlsdUQid). d,fhj <s err usvQsuru LDrriU 
s)S\^ss)^s,ss)err Q&iuGijg,ir)(& ) ^j^jQsu 2_<56i//_o. 

https://pymbook.readthedocs.io/en/latest/

4. SQL/NoSQL/JSON 

Get to know the following structured query languages (SQL).

MySQL 

NoSQL with Redis, MongoDB

http://freetamilebooks.com/ebooks/learn-mysql-in-tamil/

http://freetamilebooks.com/ebooks/learn-mysql-in-tamil-part-2/

5. Visualizations 

Learn how to create charts that make data easy to understand at a glance. Matplotlib and Kibana are available for this.
6. Cloud services


Bigdata tools can be learned on our own laptop. That needs at least 4GB to 8GB of RAM.

AWS, Digital Ocean, Google Cloud Platform, Azure

Get to know these. Learning on some cloud service provider works even better, and it is fine if that costs a little. Such cloud services help in creating new virtual machines and in easily processing TB-scale data.


7. The Big Data tools 

Hadoop, Spark, Pig, Hive, Scikit Learn, Tensor Flow 

QurrssT(Dsii(DSS)(DU ujDdjKoUuevtsvrTLb a/orQjd QamshiQrijrhjgim. ^)« dsi /( 3 uj pmd 

Q&6V6V <si 5 ) 0 lol//_o s_su<s< 5 ^y<s 0 / 5 /_D«f)LD ^ss)Lp^^]d Q^svsvnd. 

http://freetamilebooks.com/ebooks/learn-bigdata-in-tamil/ 

http://tutorialspoint.com/

https://www.kaggle.com/
8. Maths/Algorithms


We tend to think maths ended with school and college, but maths never lets go of us. Learn statistics a little more thoroughly. It helps in understanding how algorithms are built.


9. Community Contributions 


LDfDjD<suij^<^d(^ s£0 ®S)s)%ujiE>G$)&>Lj upi^Jd apr^jd Gla,rr($)d(& ) Lb(3urr& ) i&)rrG$T, prod 
<3isv>g,u ufDj& (% )sstss)Ii1 ) ^LpwrTad appjd QarrshdlQpmd. srsmQsu tEarasir appjd 
Qarrsmisng,u uprSl ^issT(Lpid spqij utsisy srQ-g^i QsusrfluSlQd thrash. ^mdlsvishsrr 
(^(Lgdashlsv ^ssxsmp&j, (Satjrbaj ajDpjdGlarrsht^rhjash. Meetup. COm £g)<5/D0 

2 _<g6i//_o. 


^snsuQiusvsvmd urr/juu^jf)^ LDsnsvuurra ^jcmdarrsviLd. spsirGlsurrsirtDrra Qaiinup 
Qprri_thi(^thiash. fithrash si 5)0 ldl//_d ^iurjp^jf)(^ 2 _thrass)sn ^ss)Lppajd Qaevsynd. 


)s$)a> LSsmQid Ldsm(rj)id tflsnmsy uQpisld Gl a rrsht^ thrash. 


^jesremroiu ^smsmurid (s^ip ssoSleb, 2_/sf<s^<950 ^0 aSle^imd Qgfiujefilsbssxso 
ereirpmsb, <3l£fD(3)d arrrjsmid ,rfjihiaeh^6S)g Qgfflrsg] Garrehen (ipiuiD® 
GaiuiuaShsbesxso .. 
5. Acknowledgements


Machine Learning - Online Free Course - Andrew NG 


https://www.coursera.org/learn/machine-learning


£g)/5<5 £g )ssxsmm euL^lu um_Lb, zj 5 )<s 6 iy/_o umss)isnsn§i. <5 su/n s^i—rr^rfansh. 
6. Machine Learning


Machine learning is a field that is currently growing rapidly. Teaching a computer, feeding it knowledge, and having computers themselves make decisions on the basis of the knowledge imparted are among the many topics covered in machine learning. Merely writing programs to make a computer do work that a human does is not machine learning; that is called Automation.

Making computers think like humans and take decisions on their own; what must be done so that those decisions are intelligent rather than mechanical; how making them think this way became possible; what methods and what theories are involved: explaining all of this is what machine learning is about.

Doing all this takes more than information technology knowledge alone; some basic knowledge of other fields such as mathematics and statistics is needed too. Only then can we teach a computer easily. Moreover, the distinctive trait of machine learning is that it changes its test results and predictions to suit changing circumstances and data. This is called Adaptivity. The kinds of situations that call for machine learning are described below.


Tasks done through human experience: driving a vehicle, recognising a person just by hearing their voice, and the like are things a person does through their own experience and ability. We cannot write programs that tell a computer directly how to do them. We have to train the computer by giving it that experience and knowledge. Further, whichever field we want to teach the computer about, the necessary knowledge has to be supplied through experts in that field. This is called domain expertise.


Tasks beyond human capacity: experiments of many kinds are carried out in every field, such as space, geography, and science. For them to succeed, we have to take the results of earlier experiments that failed and use them to predict what leads to success. Take the medical field as an example: deaths of women during childbirth were once common in household after household. To bring that down, the records of the many thousands of women who had died in childbirth up to then have to be collected. From them we have to find out why each woman died, how many women died of the same kind of cause, and how many kinds of causes lead to death. All of this is truly beyond human capacity. So computers are trained to do it, and they find the data that follow the same pattern. Medical experts then review those findings and decide which are the "factors that lead to death". On that basis, if even one of those factors shows up in a woman who is pregnant today, she is treated surgically at once. That is why, even though such surgery is now widespread, the death rate has fallen drastically.

How to teach machines has been studied on the basis of how animals learn. Two major theories play an important part in this.
Bait's shyness: this means fear of food that may be poisoned. Rats are wary of food that looks attractive but has had poison put in it. So before eating all of it, they first take a small part of the food and eat that. If it does them no harm, they decide the food can be eaten; if it harms them, they decide it should not be eaten. Later, when they come across the same kind of food again, they decide whether or not to eat it on the basis of the results of the test they carried out earlier. The same thing happens in machine learning.

The computer first takes a small part of a very large body of data and examines it. This is called sampling. That small part of the data is called training data. Categorising those data, in the manner of "can be eaten" and "should not be eaten", is called labeling. Using those results to make predictions about new incoming data is called Predicting the future data. Many terms of this kind are used in machine learning. But such learning can sometimes go astray. The following theory is a good illustration of that.


Pigeon's superstition: we can call this the pigeons' mistaken prediction. The psychologist B.F.Skinner once ran an experiment with pigeons. He put several pigeons together in a cage and set up an automatic mechanism that delivered food to them at fixed intervals. It worked as intended and supplied food each time. To work out how the food kept arriving, each pigeon began to take note of what it had been doing at that moment. One pigeon, finding that food came whenever it was nodding its head, concluded that the food came because it nodded; another, finding that food came whenever it was hopping, concluded that the food came because it hopped.


But the two events were related only by coincidence (temporal correlation). In reality one had nothing to do with the other. The pigeon, however, set up a false link between the two and made its predictions on that basis. Then the time at which the mechanism ran was changed, and the food began arriving at different times. Unaware of this, the pigeons nodded and hopped, no food came, and their belief broke down. The cause was that they had never identified the time at which the food arrives. The same must not happen to our machines. Unless the probability of events that coincide by chance being related is high, the predictions our machines give us should not rest on them.


Going back to the rat example: if a rat eats and is then hurt by an electric shock, the rat knows that the two have nothing to do with each other. It is this kind of basic knowledge, found in the rat, that we have to impart to the computer. This is called Inductive bias. Biased means showing partiality, leaning to one side. Inductive bias means that the machine does not simply learn the outcomes it has observed as they are, but weighs them in the light of prior knowledge. Supplying such knowledge to the computer needs experts from the relevant field. They are called domain experts.


Machine learning can be divided in the following 4 ways.

1. Supervised vs Unsupervised Learning: teaching the computer by spelling out everything, what the input should be, what the output should be, and by which rules the two are to be linked, is called supervised/structured learning. For example, before buying okra at a shop we look at its colour and gently snap its tip. If it is green and tender we decide to buy it; if it is hard and brownish we decide not to.


• The factors used to decide whether or not to buy an okra, such as its colour and texture, form the domain set. They are given as the input X.


• The values "buy" and "don't buy" are called labels. They form the output Y.


• A mapping function links the values at the input to the values at the output: soft -> buy, hard -> don't buy, green -> buy, brown -> don't buy. This is called the rules set. f: X -> Y


• Learning on the basis of the rules laid down in the rules set is done by the learner.


• Deciding, on the basis of what the learner has learned, what the output should be for a new input is done by the predictor. That is, when we look at another okra later, there is no need to test it all over again; we can decide to buy it or not on the basis of the earlier test results.
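
The domain set, labels, rules set and predictor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the English feature names stand in for the okra example, and the way several features are combined (buy only when every observed feature maps to "buy") is an assumption, not something the text specifies.

```python
# Hypothetical rules set f: X -> Y for the okra example.
# Keys are domain-set features (X); values are labels (Y).
RULES = {
    "soft": "buy",
    "hard": "skip",
    "green": "buy",
    "brown": "skip",
}


def predict(features):
    """Predictor: apply the learned rules to a new, unseen okra.

    Assumption: the okra is bought only if every observed
    feature maps to the label 'buy'.
    """
    labels = [RULES[feature] for feature in features]
    return "buy" if all(label == "buy" for label in labels) else "skip"


print(predict(["green", "soft"]))
print(predict(["brown", "hard"]))
```

A real learner would derive `RULES` from labelled examples rather than have them typed in; the dictionary here plays the role of the knowledge the learner ends up with.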


This can be divided into two kinds, classification and regression. In classification, the value falls under some category. In the okra example, falling under the category "buy" or "don't buy" illustrates this. In regression, the value is a real number. Examining a scan of an unborn baby and predicting in kg how much it will weigh at birth is an example of this.


In Unsupervised Learning only the input values are present. There is no specification of what the output value should be or which rules it should follow. By examining the input values alone, the method finds a pattern in them and presents that to us as its prediction for the output. It can be divided into two kinds, clustering and association.

Finding and grouping the products that sell according to customers' preferences is an example of clustering. Here only the sales data are taken as input, and the sales pattern is discovered by following the data wherever they lead. Taking what has been discovered and predicting which other similar products customers are likely to want is an example of association. In this way we can increase the sales of the various products that match those preferences. This is called unsupervised/unstructured learning.
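
As a rough illustration of the association idea, the sketch below counts how often pairs of products appear in the same basket. The basket contents are made-up data, and pair counting is only the simplest ingredient of association analysis, not a full algorithm from the text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, unlabelled sales baskets: inputs only, no expected output.
baskets = [
    ["rice", "lentils", "oil"],
    ["rice", "lentils"],
    ["soap", "shampoo"],
    ["rice", "oil"],
    ["soap", "shampoo", "oil"],
]


def pair_counts(all_baskets):
    """Count how often each unordered pair of items is bought together."""
    counts = Counter()
    for basket in all_baskets:
        # sorted(set(...)) gives each pair one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)
print(counts[("lentils", "rice")])
```

Pairs with high counts are candidates for "customers who bought this may also want that"; no labels were needed to find them.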

What lies between Structured and Unstructured is called semi-structured learning. That is, some of the data are labelled and the rest are not. In practice, the data that reach us in real time are of this kind. Examining many millions of records and labelling each one is not feasible. Leaving everything unlabelled does not help either. The important ones have to be labelled. Data that are partly labelled and partly unlabelled can be analysed in both of the ways described above. Recognition of human and animal faces, voice recognition, and the like work this way. In the Structured approach, only the labelled items are given as training data, and the others are predicted on that basis. In the Unstructured approach, a pattern is found across all the items, labelled and unlabelled, and predictions are made from that.

2. Passive vs Active Learning: accepting incoming data as they are and acting according to the rules that were taught is called passive learning. Checking whether an email is Spam or not is an example. Here the computer is taught in advance which words indicate Spam. So when a new email arrives, it is moved to the Spam folder if it contains any of those words, and to the inbox otherwise.

Suppose instead that an email contains none of the Spam words but is still suspicious and out of the ordinary (anomaly detection). Raising questions to resolve the doubt, obtaining the answers from users, and starting to learn from those answers is called active learning.
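
The passive Spam check described above can be sketched as a keyword lookup. The word list and messages are hypothetical, and a real filter would be far more elaborate; the point is only that the rules are fixed in advance and the filter asks no questions.

```python
import re

# Hypothetical list of words the filter was taught in advance.
SPAM_WORDS = {"lottery", "winner", "prize"}


def route(message, spam_words=SPAM_WORDS):
    """Return 'spam' if any taught word occurs in the message, else 'inbox'."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return "spam" if words & spam_words else "inbox"


print(route("You are the lottery WINNER"))
print(route("Meeting moved to 3 pm"))
```

An active learner would differ at the `else` branch: for a message that matches no taught word but still looks unusual, it would ask the user for a label and add what it learns to its rules.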

3. Adversarial Teacher Method: in tasks such as Spam filtering, malware detection, and biometric recognition, someone acts as a teacher and points out what is right and what is wrong whenever the taught rules are broken or circumvented. When the computer is taught in this way as well, and not from data alone, it gets the chance to learn in terms of cause and effect.
4. Online vs Batch Learning: monitoring data that change from minute to minute and learning from them is called Online learning; taking historical data and learning from them is called batch learning. The way a Stock broker makes predictions is an example of online learning, and the way population forecasts are made is an example of batch learning. A census is divided into periods such as 1970 - 80, 1981 - 90, 1991 - 2000, 2001 - 10, with a survey carried out every 10 years. The population forecast for the coming years is made on that basis. This is a good example of batch processing.

The various theories in machine learning, and the various algorithms built on them, are covered in the articles that follow.
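
The difference between the two styles can be illustrated with the simplest quantity one might estimate from data, a mean. This is a sketch on made-up numbers, not an algorithm from the text: the online version updates its estimate as each observation arrives, while the batch version waits for the whole history.

```python
class OnlineMean:
    """Running mean that is updated one observation at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental update: no past observations need to be stored.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(values):
    """Mean computed once over the complete stored history."""
    return sum(values) / len(values)


prices = [101.0, 103.0, 102.0, 106.0]  # hypothetical stream of observations
online = OnlineMean()
for price in prices:
    online.update(price)

print(online.mean, batch_mean(prices))
```

Both arrive at the same value here; what differs is when the estimate becomes available and how much history has to be kept.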
7. Statistical Learning


Learning from statistical data is the foundation of machine learning, because every prediction rests on the statistics supplied as data. This section looks at how to handle such statistics effectively and teach the computer from them. This is called the Statistical learning model.


Domain set: this is the name given to the data supplied as input. x={.} is called the domain set / instance space. Each individual item in it is called a domain point / instance.


x = x_1, x_2, .....


For example, to predict with an algorithm what price to put on a 1000-page notebook, the page counts of notebooks whose prices we have already fixed are supplied as input.


X = [10, 50, 150, 600, 800]


Label set: this holds the values that should come out as output. Y = {.}. It contains the values that say what each input falls under. The person who helps supply such values is called a 'domain expert'.


y = y_1, y_2, ....., y_m


The prices of the notebooks given at the input appear here.


Y = [50, 95, 250, 750, 999]


Mapping function: using the two sets above, the mapping function (f) establishes the relationship between input and output. It is through this that our algorithm learns about real-world data.

f : x -> y

f : 10 pages -> 50 rupees; 50 pages -> 95 rupees; 150 pages -> 250 rupees .....

Probability Distribution: the sample data we supply should be well spread out. Predictions should not be made after supplying only one or two data points. For example, if we give only the prices of a 10-page and a 500-page notebook and then ask for the price of a 1000-page notebook, the prediction is bound to be wrong. For it to be reasonably accurate, the data used for training must be spread evenly. That is, the page counts of a range of notebooks, such as 10, 50, 150 ....., and their prices should be supplied in a regular way. This is called the probability distribution. Only a decision made on that basis will be correct.
Sample data: the data we select and supply for training are called sample data or training data. For example, suppose we have the page counts and prices of 500 notebooks. Rather than training on all of them, we pick the price of one notebook with 0 - 50 pages, the price of another from those with 50 - 100 pages, and so on. The representative data selected and supplied in this way are called Sample data.


Learner: From the sample data we have selected and sent, our algorithm builds up knowledge about how to set the price of books. An algorithm that has learned in this way is called a learner. (A(S) = Algorithm of sample data)

A(S)


S = X x Y


S = ((x1, y1), (x2, y2), ...)



Predictor: When some new, unlabelled data points arrive, the component that uses the knowledge built up by the learner to decide which label each of them should fall under is called the predictor. It is also known by other names such as hypothesis or classifier (h = hypothesis). For example, if we only have prices for books of up to 800 pages, the predictor estimates what the price should be as the page count rises beyond that.
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h : x -> y 


Validation data: The observations used to test whether a predictor's predictions are correct are called validation data. At least 30 observations are needed to select a good predictor. Suppose we hold 500 sample records. Instead of teaching the learner with all of them, we should teach it with only 300. Afterwards we ask the trained algorithm to predict the price for the remaining 200 records. Those 200 records, which help us check whether the algorithm predicts correctly, are the validation data. In general about 25% of the sample data is set aside as validation data.
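The hold-out idea described above can be sketched as follows. This is only an illustration: the function name `holdout_split`, the fixed seed, and the made-up pricing rule used to generate the 500 records are assumptions, not part of the original text.

```python
import random


def holdout_split(samples, validation_fraction=0.25, seed=0):
    """Shuffle the samples and split them into (training, validation) lists."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    # The first n_validation shuffled records are held back for validation.
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical data: 500 (pages, price) pairs; the pricing rule is invented.
samples = [(pages, 2 * pages + 10) for pages in range(1, 501)]

# 300 records for training and 200 for validation, as in the example above.
training, validation = holdout_split(samples, validation_fraction=0.4)
print(len(training), len(validation))
```

Shuffling before splitting keeps both parts spread over the whole range of page counts, which matches the earlier point about widely distributed sample data.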


LOSS / Risk: A prediction is never 100% correct, and it would in fact be wrong if it were. Risk is the measure of how far the prediction made by the predictor differs from the true value. That is, when we test with validation data, the difference between the true value we hold and its prediction is called the 'loss'. For example, even if we already know that an 850-page notebook costs 1200 rupees, the prediction made by an algorithm will be something like 1190 rupees or 1210 rupees. This is because it estimates an approximate price from the book prices it has learned, so the result will look like this. This is also the right behaviour. A small difference of this kind means that our algorithm is working correctly. If it returned the price with perfect precision, it would mean that it has memorised the data and is reciting it rather than learning and predicting (this is covered under Over-fitting). There are 2 ways to measure such risk: first the true risk, then the empirical risk.
True Risk: When the price of a notebook worth 1200 rupees is predicted as 1190 rupees, the 10 rupees in between is the true risk. It is also called the generalization error, meaning the error that arises from building something general. However, instead of calculating such errors separately for every record, their average, called the empirical risk, is computed. R(h) stands for Risk of hypothesis. The following formula says that a risk is counted whenever the predicted output h(x) is not the true output f(x) that the mapping should give.


$$R(h) = L(h(x_i), f(x_i)) = [\, h(x_i) \neq f(x_i) \,]$$


Empirical risk: Suppose we send 200 books for testing. By finding the difference between the true price and the predicted price of each one and averaging these differences, we obtain an approximate risk for all the data. This is the empirical risk. With it we can state that the predicted book prices will be off by roughly so many rupees.


$$R_{emp}(h) = \frac{1}{m} \sum_{i=1}^{m} L(h(x_i), f(x_i))$$
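The averaging in the empirical-risk formula can be illustrated with a short sketch. The 0-1 loss mirrors the "h(x) differs from f(x)" rule given for R(h); the labels "cheap" and "costly", the 100-page threshold, and the four validation records are hypothetical and serve only as an example.

```python
def zero_one_loss(predicted, actual):
    """Return 1 when the prediction differs from the true label, otherwise 0."""
    return 0 if predicted == actual else 1


def empirical_risk(h, samples, loss=zero_one_loss):
    """Average the loss of predictor h over (x, true_label) pairs."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(loss(h(x), y) for x, y in samples) / len(samples)


def h(pages):
    # Hypothetical predictor: books under 100 pages are labelled "cheap".
    return "cheap" if pages < 100 else "costly"


# Hypothetical validation records: (pages, true label).
validation = [(10, "cheap"), (50, "cheap"), (150, "costly"), (90, "costly")]

# Only the 90-page book is mislabelled, so one of four predictions is wrong.
print(empirical_risk(h, validation))  # 0.25
```

A different loss, for instance the absolute price difference, can be passed through the `loss` argument without changing the averaging step.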


Empirical Risk Minimization: To find the best among several predictors built with different algorithms, validation data is given to each and the predictions of each one are examined. The average loss of each predictor is computed over many observations. Finding the predictor whose average loss is the smallest is called Empirical Risk Minimization. Its formula uses arg min, which stands for 'argument of the minimum'. Each algorithm measures the value found in this way using a few parameters of its own. Only after this can we work out how far the prediction can be trusted and how accurate the predicted value will be. This is covered in the PAC Model.
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8. Probably Approximately Correct 
(PAC Method) 


How correct the prediction made by a predictor will be, how far it can be trusted, and similar questions are all calculated with this method. First, it uses a few definitions to test which properties a predictor must have for its prediction to be probably approximately correct. That is, it examines whether the predictor is free of over-fitting, whether it has an inductive bias, whether the training data was supplied in an i.i.d. manner, and what sample complexity is needed for the prediction to be reasonably correct. Then, through the accuracy and confidence parameters, it calculates how accurate our prediction is. This method relies on an assumption called the realizability assumption, which is dropped in the Agnostic PAC Model that we will see later. Each of the terms mentioned here is explained below.

Overfitting: If the learner is trained on all the data at once instead of on a selection of sample data, there is a danger of overfitting. A learner that receives far too much data does not try to learn and simply memorises. During testing it returns exactly the value we expect, so the value of the risk is always low. Even so, this cannot be taken as a correct prediction, because it cannot predict properly for new data that was not supplied during training. Inductive bias exists to prevent this overfitting.
Inductive bias: The hypothesis class describes how each item in the training data should be related to a label. This is the inductive bias. 'Biased' means leaning towards one thing. In this method the learner builds up its knowledge of the data from the relationships stated in the hypothesis class. Making predictions on the basis of knowledge acquired in this way is called inductive bias. This is also the right approach.

Hypothesis Class: The hypothesis class is what provides the basis for a learner's inductive bias. It can be divided into two kinds, finite and infinite. Defining in advance which categories the inputs can fall under, and asking for predictions within them, gives a finite hypothesis class. For example, if someone who logs in to YouTube keeps listening to devotional songs in the morning and Ilaiyaraaja songs in the evening, the hypothesis class for that person consists of two categories: devotional songs and Ilaiyaraaja songs. This is an example of a finite hypothesis class. Another person, however, switches between love, devotional, comedy, fight, dance, children's songs and more, so that it is impossible to define which categories their taste falls under. The hypothesis class for that person keeps growing, and we cannot say that it contains only so many categories. This is an example of an infinite hypothesis class.


Sample complexity: If the number of sample records is too small or too large, prediction does not work properly. The success of a predictor therefore depends on the number of records supplied to it as samples. Sample complexity tells us roughly how many sample records must be given for the prediction to be reasonably correct.


$$S \sim D^m$$


This is read as "S is drawn from D to the power of m", where m is the number of records taken as samples. The bound used to calculate that number is as follows.


$$\left\lceil \frac{\log(|H| / \delta)}{\epsilon} \right\rceil$$
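To see how the bound behaves, here is a small sketch that evaluates it. It assumes the natural logarithm and a finite hypothesis class of size |H|; the function name `sample_complexity` and the example numbers are illustrative only.

```python
import math


def sample_complexity(hypothesis_count, epsilon, delta):
    """Evaluate ceil(log(|H| / delta) / epsilon) for a finite hypothesis class."""
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(hypothesis_count / delta) / epsilon)


# Hypothetical setting: 1000 hypotheses, accuracy 0.1, confidence 1 - 0.05.
print(sample_complexity(1000, epsilon=0.1, delta=0.05))  # 100
```

The required number of samples grows only logarithmically with |H| and 1/δ, but in proportion to 1/ε, so asking for a tighter accuracy is what makes the bound grow fastest.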


The sample data supplied for this follows an assumption called i.i.d., which stands for independently and identically distributed. It insists that the learner be taught with separate data samples that do not depend on one another.

Realizability assumption: In the example we saw earlier, our algorithm forms the assumption that as the number of pages in a book increases, its price also increases. This is called the realizability assumption. But this assumption does not suit every kind of prediction. For example, when a coin is tossed, no assumption can be made about whether it will land heads or tails. Predictions involving this kind of uncertainty are covered in the Agnostic PAC model.

Accuracy parameter: The symbol ε is used to indicate how accurate the value given by a predictor/classifier will be. R(h) > ε is therefore taken as a failure of the predictor, and R(h) <= ε is taken as an approximately good predictor.

Confidence parameter: It is expressed in terms of the value delta (δ).

$$\begin{cases} 1, & h(x_i) = f(x_i) \\ 0, & h(x_i) \neq f(x_i) \end{cases}$$

The case where what we expect and what the predictor returns agree is scored as 1, and the case where they differ is scored as 0. On this basis, if δ is the probability that the predictor turns out wrong, then 1-δ is the probability that it turns out approximately correct. The confidence parameter (1-δ) therefore indicates how far such a model can be trusted.


Next, using what we have learned so far, let us see how to build a simple linear regression.
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9. Linear Regression 


9.1 Simple Linear Regression 


Simple Linear is a basic algorithm in machine learning. Using a few data points, we are going to work through in practice how two quantities are related, how the algorithm forms its understanding, and how correct that understanding is. As an example, this exercise shows how to predict the price of a pizza from its size. The sizes of all the pizzas we have so far and their prices should be placed in the variables X and Y. These are the domain set (X) and the label set (Y).


x= [6,8,10,14,18,21] 

y= [7,9,13,17.5,18,24] 

X, which holds the diameters of various pizzas (in inches), is called the explanatory variable, and Y, which holds their prices (in dollars), is called the response variable. Let us draw these data points as a plot, because only then can we see the trend they follow. matplotlib is a library that helps to draw plots. Its pyplot module is used to draw the plot for our data points. The program for it is as follows.


https://gist.github.com/nithyadurai87/cb77831526033da63be0790f917efe63


import matplotlib.pyplot as plt

# Pizza diameters (inches) and their prices (dollars)
x = [[6], [8], [10], [14], [18], [21]]
y = [[7], [9], [13], [17.5], [18], [24]]

plt.figure()
plt.title('Pizza price statistics')
plt.xlabel('Diameter (inches)')
plt.ylabel('Price (dollars)')
plt.plot(x, y, '.')
plt.axis([0, 25, 0, 25])
plt.grid(True)
plt.show()


The plot it produces looks as follows.

[Figure 1: scatter plot titled "Pizza price statistics"; x-axis: Diameter (inches), y-axis: Price (dollars)]

In this plot we can see a direct relationship between the diameter of a pizza and its price. A direct relationship means that as one value increases, the other increases too. That is the case here as well. Next, the program that uses this data to teach an algorithm is as follows.


https://gist.github.com/nithyadurai87/d94507f9052a6120dce5f20e31806cea
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Training data: pizza diameters (inches) and prices (dollars)
x = [[6], [8], [10], [14], [18]]
y = [[7], [9], [13], [17.5], [18]]

model = LinearRegression()
model.fit(x, y)

plt.figure()
plt.title('Pizza price statistics')
plt.xlabel('Diameter (inches)')
plt.ylabel('Price (dollars)')
plt.plot(x, y, '.')
# Dashed line: the prices the fitted model predicts for the training diameters
plt.plot(x, model.predict(x), '--')
plt.axis([0, 25, 0, 25])
plt.grid(True)
plt.show()

print("Predicted price = ", model.predict([[21]]))


Explanation of the program: sklearn is a package that contains many kinds of algorithms. The LinearRegression() class is imported from its linear_model library. This is the learner/predictor. The fit() method is used to teach it about our data. Then pyplot is used to draw a plot showing what our model has learned. Finally, the predict() function acts on our model and predicts what the price of a 21 inch pizza will be.


Output of the program: When the Python program above is run, it displays a plot. It then prints the value [[22.46767241]] as the price of a pizza that is 21 inches in size.
In this plot we can see a straight line running through the middle of all the data points. This is called the hyperplane. This line is the algorithm's understanding. For any pizza diameter that comes later, it predicts the price using this line. We can see a small gap between the line and the places where the actual data points lie. This gap is called the residuals or the training error. In the example we saw here, we already know that a pizza with a 21 inch diameter costs 24 dollars. But when our model predicts the same thing, it shows a price of about 22 dollars. This is also called the generalization error or risk, meaning the error that arises from forming a general understanding and predicting with it. Residual sum of squares is a function that helps to calculate this risk. It is also called the loss function or cost function.
Residual sum of squares finds and reports the average of this loss. What causes this risk and how it is calculated are described below.


Algorithm - Simple linear:

The equation that our predictor learns is as follows. This is the algorithm for simple linear regression.

y = α + βx


Apart from our explanatory and response variables, it contains two parameters: α (the intercept term) and β (the coefficient). Our algorithm learns these as well, and they are the source of our model's risk. Once we know how they are found, we will know how to reduce the risk. First the value of β must be found. The value of α can then be found from it. Variance indicates how widely the values of our explanatory variable are spread around their mean. It is 0 only when all the values are identical, for example [5, 5, 5, 5]. For values such as [1, 5, 7, 10, 11], variance measures how large that spread is. Co-variance indicates how our explanatory and response variables vary together. If there is no linear relationship between the two, its value is 0. The formulas for these are as follows.

$$var(x) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

$$cov(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$








A few mathematical functions in the NumPy library apply our data to the formulas above and return the answers.


https://gist.github.com/nithyadurai87/406747e718d04a4bc339f740b5f9de62


from sklearn.linear_model import LinearRegression
import numpy as np

x = [[6], [8], [10], [14], [18]]
y = [[7], [9], [13], [17.5], [18]]

model = LinearRegression()
model.fit(x, y)

# Mean of the squared residuals (the residual sum of squares divided by the sample count)
print("Residual sum of squares = ", np.mean((model.predict(x) - y) ** 2))
# ddof=1 gives the sample variance (divides by n - 1)
print("Variance = ", np.var([6, 8, 10, 14, 18], ddof=1))
print("Co-variance = ", np.cov([6, 8, 10, 14, 18], [7, 9, 13, 17.5, 18])[0][1])
print("X_Mean = ", np.mean(x))
print("Y_Mean = ", np.mean(y))
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£g)<563T QiSl/(Syftllf/_/7<95 iSiSSTSUQ^Lb LD§)lULI 3> 6YT Q 61/ SlflLJU® LD . 

Residual sum of squares = 1.7495689655172406
Variance = 23.2
Co-variance = 22.650000000000002
X_Mean = 11.2
Y_Mean = 12.9

If we substitute the values found in this way into the equation, we can see how the price of a pizza with a 21 inch diameter comes out as about 22.46 dollars (22.5 with the rounded values below).

y = α + βx
  = α + β(21)
  = 1.92 + (0.98 * 21) = 1.92 + 20.58 = 22.5

where
β = 22.65 / 23.2 = 0.98
α = 12.9 - (0.98 * 11.2) = 12.9 - 10.976 = 1.92
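The hand calculation above can be reproduced without rounding. The sketch below assumes the sample (n - 1) forms of variance and covariance, matching the ddof=1 call in the earlier listing; the helper names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_mean = mean(xs)
    return sum((x - x_mean) ** 2 for x in xs) / (len(xs) - 1)


def sample_covariance(xs, ys):
    """Sum of products of paired deviations, divided by n - 1."""
    x_mean, y_mean = mean(xs), mean(ys)
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (len(xs) - 1)


def fit_simple_linear(xs, ys):
    """Return (alpha, beta) for y = alpha + beta * x."""
    beta = sample_covariance(xs, ys) / sample_variance(xs)
    alpha = mean(ys) - beta * mean(xs)
    return alpha, beta


diameters = [6, 8, 10, 14, 18]
prices = [7, 9, 13, 17.5, 18]

alpha, beta = fit_simple_linear(diameters, prices)
print("alpha =", alpha, "beta =", beta)
print("Predicted price for 21 inches =", alpha + beta * 21)
```

With unrounded parameters the prediction is about 22.4677, which agrees with the value printed by model.predict([[21]]); the 22.5 in the hand calculation comes from rounding β to 0.98.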

R-Squared Score: Finally, the R-Squared score measures how well the model we have built fits, that is, how closely it reproduces the true values.
https://gist.github.com/nithyadurai87/a39ecee72dc4a266933621c298e80df9


from sklearn.linear_model import LinearRegression

x = [[6], [8], [10], [14], [18]]
y = [[7], [9], [13], [17.5], [18]]

model = LinearRegression()
model.fit(x, y)

# Validation data that the model has not seen during training
x_test = [[8], [9], [11], [16], [12]]
y_test = [[11], [8.5], [15], [18], [11]]
print("Score = ", model.score(x_test, y_test))


Score = 0.6620052929422553


In this program the score() function applies our validation data to the model and returns the score. The value normally lies between 0 and 1 (it can fall below 0 when the model fits worse than simply predicting the mean). Since a score of exactly 1 suggests overfitting, a value reasonably close to 1 is one we can accept. Our model's score comes out as 0.66. The accuracy of multiple linear regression is often higher than that of simple linear regression. We will look at that next.
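The score can also be worked out by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below takes α and β from the covariance, variance and means printed earlier (22.65, 23.2, 11.2 and 12.9); the function name `r_squared` is illustrative.

```python
def r_squared(actual, predicted):
    """Return 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Parameters of the fitted line, derived from the values printed earlier.
beta = 22.65 / 23.2
alpha = 12.9 - beta * 11.2

x_test = [8, 9, 11, 16, 12]
y_test = [11, 8.5, 15, 18, 11]

predictions = [alpha + beta * x for x in x_test]
score = r_squared(y_test, predictions)
print("Score =", score)
```

The result is approximately 0.662, the same figure that model.score(x_test, y_test) reports.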
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9.2 Multiple Linear Regression 


In simple linear regression we saw the price of a pizza increase with its diameter. In reality, however, the toppings sprinkled on it are also a factor in the price. So the price of a pizza depends on both its diameter and the number of toppings on it. When the response variable depends on more than one explanatory variable like this, it is called multiple linear regression. Its equation is as follows.

y = α + β1 x1 + β2 x2

For the same example as above, we have built a multiple linear model by adding the number of toppings to the explanatory variable. It is as follows.


https://gist.github.com/nithyadurai87/7068c32bd4d7fccb67ccca39623f68bc


from sklearn.linear_model import LinearRegression
from numpy.linalg import lstsq
import numpy as np

# Each row is [diameter in inches, number of toppings]
x = [[6, 2], [8, 1], [10, 0], [14, 2], [18, 0]]
y = [[7], [9], [13], [17.5], [18]]

model = LinearRegression()
model.fit(x, y)

# Validation data
x1 = [[8, 2], [9, 0], [11, 2], [16, 2], [12, 0]]
y1 = [[11], [8.5], [15], [18], [11]]

predictions = model.predict([[8, 2], [9, 0], [12, 0]])
print("values of Predictions: ", predictions)
print("values of β1, β2: ", lstsq(x, y, rcond=None)[0])
print("Score = ", model.score(x1, y1))


Output of the program:

The output of the program above is as follows. Notice that the accuracy has increased: the score is 0.66 for simple linear and 0.77 for multiple linear. Using multiple linear rather than simple linear usually gives a higher accuracy when the additional variables carry useful information.


values of Predictions: [[10.0625 ]
 [10.28125]
 [13.3125 ]]
values of β1, β2: [[1.08548851]
 [0.65517241]]
Score = 0.7701677731318468

The values we obtained fit the above equation as follows. Here the intercept term α does not depend on the variables x1 and x2, so it is in general a constant.

10.06 = α + (1.09 * 8) + (0.66 * 2)
      = α + 8.72 + 1.32
      = α + 10.04

10.28 = α + (1.09 * 9) + (0.66 * 0)
      = α + 9.81 + 0
      = α + 9.81

13.31 = α + (1.09 * 12) + (0.66 * 0)
      = α + 13.08 + 0
      = α + 13.08
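One point worth checking here: lstsq(x, y), as called in the listing, fits a model with no intercept column, whereas LinearRegression fits an intercept by default. The coefficients 1.09 and 0.66 therefore belong to a slightly different model from the one that produced the predictions, which is why the three lines above do not yield one single value of α. The sketch below is one way to recover an intercept and coefficients that reproduce the predictions; prepending a column of ones is a standard technique, and the variable names are illustrative.

```python
import numpy as np

# Each row is [diameter in inches, number of toppings]
X = np.array([[6, 2], [8, 1], [10, 0], [14, 2], [18, 0]], dtype=float)
y = np.array([7, 9, 13, 17.5, 18], dtype=float)

# Prepend a column of ones so the first fitted coefficient acts as the intercept.
X_with_intercept = np.column_stack([np.ones(len(X)), X])
coefficients, *_ = np.linalg.lstsq(X_with_intercept, y, rcond=None)
alpha, beta1, beta2 = coefficients


def predict(diameter, toppings):
    """Evaluate y = alpha + beta1 * diameter + beta2 * toppings."""
    return alpha + beta1 * diameter + beta2 * toppings


print("alpha, beta1, beta2 =", alpha, beta1, beta2)
print(predict(8, 2), predict(9, 0), predict(12, 0))
```

Fitted this way, α is about 1.19, β1 about 1.01 and β2 about 0.40, and the three predictions come out as 10.0625, 10.28125 and 13.3125, matching the output of model.predict.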


9.3 Simple Linear Algorithm 


The equation for simple linear regression is as follows. As an example, let us take the data points (1,1), (2,2), (3,3) and predict them with the following predictor h(x).


h(x) = θ0 + θ1 x


This prediction depends on two key parameters, theta-0 (θ0) and theta-1 (θ1). These are what we earlier called alpha and beta. The following example shows how parameters with different values lead to different predictions.
https://gist.github.com/nithyadurai87/c57ac1197368249f015ed4d1dba029f0


import matplotlib.pyplot as plt

x = [1, 2, 3]
y = [1, 2, 3]

# Plot the actual data points
plt.figure()
plt.title('Data - X and Y')
plt.plot(x, y, '*')
plt.xticks([0, 1, 2, 3])
plt.yticks([0, 1, 2, 3])
plt.show()

def linear_regression(theta0, theta1):
    # Evaluate h(x) = theta0 + theta1 * x for every x and plot the result
    predicted_y = []
    for i in x:
        predicted_y.append(theta0 + (theta1 * i))
    plt.figure()
    plt.title('Predictions')
    plt.plot(x, predicted_y, '.')
    plt.xticks([0, 1, 2, 3])
    plt.yticks([0, 1, 2, 3])
    plt.show()

theta0 = 1.5
theta1 = 0
linear_regression(theta0, theta1)

theta0a = 0
theta1a = 1.5
linear_regression(theta0a, theta1a)

theta0b = 1
theta1b = 0.5
linear_regression(theta0b, theta1b)


First the plot for (1,1), (2,2), (3,3) is drawn. Then, when θ0 = 1.5 and θ1 = 0, substituting into the equation gives the values (1, 1.5), (2, 1.5), (3, 1.5).

h(1) = 1.5 + 0(1) = 1.5
h(2) = 1.5 + 0(2) = 1.5
h(3) = 1.5 + 0(3) = 1.5


Similarly, θ0 = 0, θ1 = 1.5 gives (1, 1.5), (2, 3), (3, 4.5), and finally θ0 = 1, θ1 = 0.5 gives (1, 1.5), (2, 2), (3, 2.5).


θ0 = 0, θ1 = 1.5:
h(1) = 0 + 1.5(1) = 1.5
h(2) = 0 + 1.5(2) = 3
h(3) = 0 + 1.5(3) = 4.5

θ0 = 1, θ1 = 0.5:
h(1) = 1 + 0.5(1) = 1.5
h(2) = 1 + 0.5(2) = 2
h(3) = 1 + 0.5(3) = 2.5


Plots are drawn for these values. They look as follows.

[Figures: one plot titled "Data - X and Y" showing the actual points, and three plots titled "Predictions", one for each pair of θ0 and θ1]

Of the 3 predictions above, we can take for the final predictor the parameter values whose predictions lie closest to the true values. The values (1, 1.5), (2, 2), (3, 2.5) come fairly close to (1,1), (2,2), (3,3). So we choose the predictor defined by θ0 = 1, θ1 = 0.5.


Because there are only 3 data points here, we can easily tell which parameters give predictions that differ least from the true values. In practice, when there are thousands of data points, the formula that helps us measure this difference is the cost function.


Cost Function: Its equation is as follows.


$$J = \frac{1}{2m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right)^2$$


J = cost function
m : the total number of data points
i : index used to step through the data points one by one
h(x) : the predicted value
y : the true value we expect

Substitute the same data points as above into this equation and check whether the predictor we selected really gives the smallest cost. The program for it is as follows.

https://gist.github.com/nithyadurai87/86bd4ec2288d0e9af138a30a7af44a09

Output:

cost when theta0=1.5 theta1 = 0 : 0.4583333333333333
cost when theta0=0 theta1 = 1.5 : 0.5833333333333333
cost when theta0=1 theta1 = 0.5 : 0.08333333333333333
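The linked program is not reproduced in the text. A minimal sketch that evaluates the cost formula for the three parameter pairs and yields the same three values could look like this; the function names `hypothesis` and `cost` are assumptions, not taken from the gist.

```python
def hypothesis(theta0, theta1, x):
    """h(x) = theta0 + theta1 * x"""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """J = 1/(2m) * sum((h(x) - y)^2) over all m data points."""
    m = len(xs)
    squared_errors = ((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return sum(squared_errors) / (2 * m)


xs = [1, 2, 3]
ys = [1, 2, 3]

for theta0, theta1 in [(1.5, 0), (0, 1.5), (1, 0.5)]:
    print("cost when theta0 =", theta0, "theta1 =", theta1, ":", cost(theta0, theta1, xs, ys))
```

The smallest of the three costs belongs to θ0 = 1, θ1 = 0.5, which is the predictor chosen by eye earlier.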


How the calculation works:


(1, 1.5), (2, 1.5), (3, 1.5) vs (1, 1), (2, 2), (3, 3) (theta0 = 1.5, theta1 = 0)

J = 1/(2*3) [(1.5-1)**2 + (1.5-2)**2 + (1.5-3)**2] = 1/6 [0.25 + 0.25 + 2.25] = 2.75/6 = 0.4583


(1, 1.5), (2, 3), (3, 4.5) vs (1, 1), (2, 2), (3, 3) (theta0 = 0, theta1 = 1.5)

J = 1/(2*3) [(1.5-1)**2 + (3-2)**2 + (4.5-3)**2] = 1/6 [0.25 + 1 + 2.25] = 3.50/6 = 0.5833


(1, 1.5), (2, 2), (3, 2.5) vs (1, 1), (2, 2), (3, 3) (theta0 = 1, theta1 = 0.5)

J = 1/(2*3) [(1.5-1)**2 + (2-2)**2 + (2.5-3)**2] = 1/6 [0.25 + 0 + 0.25] = 0.50/6 = 0.0833
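The worked calculation above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the three records (1, 1), (2, 2), (3, 3) used in the example; the function name `cost` and the variable names are chosen here for illustration and do not come from the book's gist.

```python
def cost(theta0, theta1, xs, ys):
    """Return J = 1/(2m) * sum((h(x) - y)^2) for the line h(x) = theta0 + theta1*x."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = theta0 + theta1 * x   # h(x) for this record
        total += (predicted - y) ** 2     # squared difference
    return total / (2 * m)


xs = [1, 2, 3]   # inputs of the three records (assumed from the example)
ys = [1, 2, 3]   # actual values of the three records

for theta0, theta1 in [(1.5, 0), (0, 1.5), (1, 0.5)]:
    print("cost when theta0=%s theta1=%s :" % (theta0, theta1),
          cost(theta0, theta1, xs, ys))
```

Dividing the bracketed sums 2.75, 3.50 and 0.50 by 2m = 6 is what brings the hand calculation in line with the costs printed by the program.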


Here, by adding up the differences and taking their average, we can say how far off each predictor's predictions are. The differences are squared before they are added because, when the average is taken, a negative difference would otherwise be subtracted instead of added and would cancel part of the total. That is why the formula is built in this way. It is called the sum of squares error.


We can see that the predictor we found earlier, with theta0 = 1 and theta1 = 0.5, gives the lowest cost (0.0833). So even when the number of records grows, this formula lets us measure the difference easily.


If we consider even this difference too large, gradient descent helps: it repeatedly tries different parameter values to find the ones that reduce the cost further. Before that, let us draw the various parameter values and their costs as a graph. Such graphs are contour plots.


Contour plots:


A contour plot is a three-dimensional plot of theta0, theta1, and the cost computed from them. It takes either a bowl shape or a circular shape. If we mark a point wherever a cost value lies and join all of those points, a bowl-like shape results.

In the circular form, the costs for the various theta values appear as rings of different sizes. So the parameters that give the lowest difference can be found at the centre of the rings or, equivalently, at the bottom of the bowl. Finding them is exactly the job that gradient descent does.


In the program below, the thetas are varied over 100 values between -2 and 2, and the cost is computed for each combination. numpy supplies the theta values, which are spaced uniformly across that range.


https://gist.github.com/nithyadurai87/8c120370181f5bb9ad966dc9fdd7935b


from mpl_toolkits.mplot3d.axes3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np

fig, ax1 = plt.subplots(figsize=(8, 5),
                        subplot_kw={'projection': '3d'})

# 100 evenly spaced values between -2 and 2 for each theta
values = 2
r = np.linspace(-values, values, 100)
theta0, theta1 = np.meshgrid(r, r)

original_y = [1, 2, 3]
m = len(original_y)

# predictions for x = 1, 2, 3 at every (theta0, theta1) pair on the grid
predicted_y = [theta0+(theta1*1), theta0+(theta1*2), theta0+(theta1*3)]
total = 0
for i, j in zip(predicted_y, original_y):
    total = total+((i-j)**2)
J = 1/(2*m)*total

ax1.plot_wireframe(theta0, theta1, J)
ax1.set_title("plot")
plt.show()


The plot for our data is as follows.


9.4 Gradient descent


Gradient descent does the work of finding the parameter values that give the lowest cost. It first assigns the parameters some specific values and computes the cost for them. It then reduces the parameter values at a chosen rate and computes the cost again. Repeating this, each iteration lowers the values a little until the minimum cost is found. The equations for this are as follows.

$$\theta_0 := \theta_0 - \alpha \frac{\partial J}{\partial \theta_0}$$

$$\theta_1 := \theta_1 - \alpha \frac{\partial J}{\partial \theta_1}$$
At the end of every iteration the values of theta0 and theta1 must both be updated at the same time. This is called simultaneous update. As described earlier, gradient descent searches for the bottom of the bowl or the centre of the rings. That point is the global optimum. By contrast, when the cost surface rises and falls repeatedly, each dip is called a local optimum. In general, the cost plot for linear regression has no local minimum, only a global optimum.


Alpha states the rate at which the theta values are reduced. Its value should be neither too small nor too large, but of a suitable size. Typical values are 0.1, 0.01, and 0.001.



Take the points in the contour plot above to be values of J. If alpha is very small, each iteration lowers the value only slightly, so reaching the global optimum takes a very long time, much like a child taking tiny steps. If alpha is too large, then even when J is already close to the global optimum, the big step can jump past it and land somewhere else. Continued iterations may then carry the values further and further away. So alpha must be given a value of the right size.
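The effect of alpha described above can be tried out with a small sketch. It assumes the same three records (1, 1), (2, 2), (3, 3), a starting point of theta0 = 1, theta1 = 1.5, and the usual derivatives of J with respect to each theta; the helper names `cost` and `run` and the three alpha values are illustrative choices, not taken from the book.

```python
def cost(theta0, theta1, xs, ys):
    """J = 1/(2m) * sum((theta0 + theta1*x - y)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def run(alpha, steps=100):
    """Run `steps` iterations of gradient descent and return the final cost."""
    xs, ys = [1, 2, 3], [1, 2, 3]
    theta0, theta1 = 1.0, 1.5
    m = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                               # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m    # dJ/dtheta1
        # simultaneous update: both new values come from the old thetas
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return cost(theta0, theta1, xs, ys)


for alpha in (0.001, 0.1, 0.5):
    print(alpha, run(alpha))
```

With these assumed settings the smallest alpha barely lowers the cost in 100 steps, the middle value lowers it much further, and the largest value overshoots so badly that the cost grows instead of shrinking.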

Next to alpha comes the partial derivative of the cost with respect to each theta. In simple linear regression theta0 stands alone, while theta1 is attached to x (h(x) = theta0 + theta1 x). The equations for the partial derivatives are as follows.

$$\frac{\partial J}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( h(x_i) - y_i \right)$$

$$\frac{\partial J}{\partial \theta_1} = \frac{1}{m} \sum_{i=1}^{m} \left( h(x_i) - y_i \right) x_i$$

In the program, the value of theta0 is held fixed and only the value of theta1 is reduced by gradient descent. 50 iterations are run. The cost found in each iteration is stored in the list J_history, and the lowest value is picked from it. Here the partial derivative is denoted by delta, and its value is computed as follows.

delta = 1/m . (h(x) - y).x
      = 1/m . (theta1.x - y).x where h(x) = theta1.x
      = 1/m . (x.theta1.x - x.y)
https://gist.github.com/nithyadurai87/43664cacd625e7c290c8812894dca659


x = [1, 2, 3]
y = [1, 2, 3]

m = len(y)

theta0 = 1
theta1 = 1.5
alpha = 0.01

def cost_function(theta0, theta1):
    predicted_y = [theta0+(theta1*1), theta0+(theta1*2), theta0+(theta1*3)]
    total = 0
    for i, j in zip(predicted_y, y):
        total = total+((i-j)**2)
    J = 1/(2*m)*total
    return J

def gradientDescent(x, y, theta1, alpha):
    J_history = []
    for _ in range(50):
        # theta1 is nudged once per record; theta0 stays fixed
        for i, j in zip(x, y):
            delta = 1/m*(i*i*theta1-i*j)
            theta1 = theta1-alpha*delta
        # record the cost reached at the end of this iteration
        J_history.append(cost_function(theta0, theta1))
    print(min(J_history))


gradientDescent(x, y, theta1, alpha)


Output:

0.5995100694321308
In gradient descent we can fix how many iterations to run and pick the lowest cost. Alternatively, if successive iterations keep producing the same cost, we can take it that the global optimum has been reached. For example, if the cost changes only by a very small amount (<0.001) across iterations 300 to 400, it is considered to have reached the global optimum. This is called the automatic convergence test.
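One way to picture the automatic convergence test is the sketch below. It is a simplification under stated assumptions: it compares the cost of each iteration with the one before it (rather than looking at a window of iterations), uses the three records (1, 1), (2, 2), (3, 3), and the names `descend_until_converged`, `tolerance` and `max_iterations` are hypothetical.

```python
def cost(theta0, theta1, xs, ys):
    """J = 1/(2m) * sum((theta0 + theta1*x - y)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def descend_until_converged(xs, ys, alpha=0.1, tolerance=1e-3, max_iterations=10000):
    """Stop as soon as the cost changes by less than `tolerance` between iterations."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    previous = cost(theta0, theta1, xs, ys)
    for iteration in range(1, max_iterations + 1):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
        current = cost(theta0, theta1, xs, ys)
        if abs(previous - current) < tolerance:   # the convergence test
            return theta0, theta1, current, iteration
        previous = current
    return theta0, theta1, previous, max_iterations


theta0, theta1, final_cost, steps = descend_until_converged([1, 2, 3], [1, 2, 3])
print(steps, final_cost)
```

The benefit of such a test is that the number of iterations no longer has to be guessed in advance; the loop ends by itself once further iterations stop paying off.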


9.5 Matrix 


A matrix is an arrangement of several numbers in rows and columns. In simple linear regression we predicted one number from a single number. In multiple linear regression we predict a number from more than one number taken together. Predicting the price of a house from its square footage alone is simple linear; predicting it from several factors together, such as the number of rooms and the age of the house, is multiple linear. So before going further, we need to learn a few basics: how to arrange more than one number, and how to calculate with numbers arranged in this way.

Matrix:

The number of rows and columns in a matrix is called its dimension. A matrix with 2 rows and 3 columns looks as follows. It is called a 2 * 3 dimensional matrix. To refer to a value in this matrix, write the row number and the column number as a subscript of A. For example, A22 refers to the value 5 in the second row and second column.


Matrix

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \quad (2 \times 3) \text{ matrix}$$

$$A_{22} = 5$$


In multiple linear regression, the dimension corresponds to the number of records and the number of features used for prediction. That is,

rows = no. of records

columns = no. of features

Vector:

A matrix with only one column is called a vector. It looks as follows. To refer to a value in a vector, giving the row number alone is enough. B3 refers to the value 38 in the third row. A vector can be indexed in two ways, 0-indexed and 1-indexed. B3 refers to 38 when 1-indexed and to 47 when 0-indexed.


Vector

$$B = \begin{bmatrix} 15 \\ 20 \\ 38 \\ 47 \end{bmatrix} \quad 4\text{-dimensional vector}$$

$$B_3 = 38 \quad (1\text{-indexed})$$


Matrix addition:


Two matrices can be added only when their dimensions are equal; the result is another matrix of the same dimension.


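If (3*2) + (3*2) = 3*2


Matrix Addition

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} + \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & 12 \end{bmatrix} = \begin{bmatrix} 1+7 & 2+8 \\ 3+9 & 4+10 \\ 5+11 & 6+12 \end{bmatrix} = \begin{bmatrix} 8 & 10 \\ 12 & 14 \\ 16 & 18 \end{bmatrix}$$

(3 x 2) + (3 x 2) = (3 x 2)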
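The addition above can be checked with numpy, which the book already uses elsewhere. This is a small sketch; the variable names `a`, `b` and `total` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])       # 3 x 2
b = np.array([[7, 8], [9, 10], [11, 12]])    # 3 x 2

# element-wise addition is only defined because both shapes are (3, 2)
total = a + b
print(total.shape)
print(total)
```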
Matrix multiplication:


Two matrices can be multiplied only when the number of columns in the first equals the number of rows in the second. The resulting matrix has as many rows as the first matrix and as many columns as the second.

If (3*2) * (2*2) = 3*2


The values in a row of the first matrix are multiplied one by one with the values in a column of the second. The products are then added together into a single value. Matrix multiplication proceeds in this way.


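Matrix Multiplication

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} \times \begin{bmatrix} 7 & 8 \\ 9 & 10 \end{bmatrix} = \begin{bmatrix} (1*7)+(2*9) & (1*8)+(2*10) \\ (3*7)+(4*9) & (3*8)+(4*10) \\ (5*7)+(6*9) & (5*8)+(6*10) \end{bmatrix} = \begin{bmatrix} 7+18 & 8+20 \\ 21+36 & 24+40 \\ 35+54 & 40+60 \end{bmatrix} = \begin{bmatrix} 25 & 28 \\ 57 & 64 \\ 89 & 100 \end{bmatrix}$$

(3 x 2) x (2 x 2) = (3 x 2)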
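The same product can be reproduced with numpy's matrix-multiplication operator `@`. The sketch below only restates the example above; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])   # 3 x 2
b = np.array([[7, 8], [9, 10]])          # 2 x 2

# columns of a (2) match rows of b (2), so the product is defined
product = a @ b
print(product.shape)   # rows of a by columns of b
print(product)
```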
Matrix transpose:


Turning every row of a matrix into a column gives the transpose of that matrix.


Matrix Transpose

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}^{T} = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}$$


Matrix inverse:


The inverse of a matrix is found in a slightly involved way. For a matrix of dimension 2*2 it is found as follows. The difference between the product of A11 and A22 and the product of A12 and A21 is placed under 1, that is, 1 is divided by that difference. This is then multiplied by the matrix in which the values A11 and A22 have swapped places and the values A12 and A21 have been negated.


Matrix Inverse

$$\begin{bmatrix} 2 & 4 \\ 3 & 7 \end{bmatrix}^{-1} = \frac{1}{14 - 12} \begin{bmatrix} 7 & -4 \\ -3 & 2 \end{bmatrix} = \begin{bmatrix} 3.5 & -2 \\ -1.5 & 1 \end{bmatrix}$$
Identity Matrix:


A matrix with an equal number of rows and columns is called a square matrix. If a square matrix holds 1 only along its diagonal and zero everywhere else, it is called the Identity Matrix. A matrix multiplied by its inverse produces the identity matrix.


Identity Matrix

$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

$$A(A^{-1}) = I$$
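The transpose, inverse and identity rules can be tried out together with numpy. This is a sketch under an assumption: the 2 x 2 matrix used here, [[2, 4], [3, 7]], is reconstructed from the determinant 14 - 12 and the result shown in the inverse example, so treat its entries as inferred rather than quoted.

```python
import numpy as np

# transpose: rows become columns
m = np.array([[1, 2], [3, 4], [5, 6]])
print(m.T)

# inverse of a 2 x 2 matrix (entries assumed, see the note above)
a = np.array([[2.0, 4.0], [3.0, 7.0]])
a_inv = np.linalg.inv(a)
print(a_inv)

# a matrix times its inverse gives the identity matrix
identity = a @ a_inv
print(np.round(identity, 10))
```

`np.linalg.inv` raises an error for matrices whose determinant is zero, which is the numerical counterpart of dividing 1 by a zero difference in the hand rule.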


9.6 Multiple Linear Algorithm 


Predicting something from more than one feature taken together is called multiple linear regression. If the features are x1, x2, x3, ..., its equation is as follows.

$$h(x) = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \dots = \theta^{T} x$$

In multiple linear regression there is one theta value for each feature; the number of thetas does not change with the number of rows. So theta is always a matrix with a single row holding several values. Transposing that matrix turns it into a vector whose values sit in a single column. Multiplying the transposed theta matrix with the features matrix X therefore gives the multiple linear equation.


In the equation, a new feature x0 is added alongside theta0. It always holds the value 1, so it must not change any value. It is added only to suit the rules of matrix multiplication.


In the example below,


800 sq. ft., 2 rooms, 15-year-old house: price = 3000000


1200 sq. ft., 3 rooms, 1-year-old house: price = 2000000


2400 sq. ft., 5 rooms, 5-year-old house: price = 3500000


these 3 records are given in the matrix x. Likewise, 100, 1000, 10000, and 100000 are given in the matrix theta as the values of theta0, theta1, theta2, and theta3. Both are substituted into the equation above to produce the matrix h(x).


https://gist.github.com/nithyadurai87/5abf51e4b26717a3427d15fcaca6f48f


import matplotlib.pyplot as plt
import numpy as np

# first column is x0 = 1, then square feet, rooms, age of the house
x = np.array([[1, 800, 2, 15], [1, 1200, 3, 1], [1, 2400, 5, 5]])
y = np.array([3000000, 2000000, 3500000])
theta = np.array([100, 1000, 10000, 100000])

predicted_y = x.dot(theta.transpose())
print(predicted_y)

m = y.size

diff = predicted_y - y
squares = np.square(diff)

#sum_of_squares = 5424168464
sum_of_squares = np.sum(squares)

cost_fn = 1/(2*m)*sum_of_squares

print(diff)
print(squares)
print(sum_of_squares)
print(cost_fn)


Output:

[2320100 1330100 2950100]

[-679900 -669900 -549900]

[462264010000 448766010000 302390010000]

1213420030000

202236671666.66666


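How the calculation works:

$$\begin{bmatrix} 1 & 800 & 2 & 15 \\ 1 & 1200 & 3 & 1 \\ 1 & 2400 & 5 & 5 \end{bmatrix} * \begin{bmatrix} 100 \\ 1000 \\ 10000 \\ 100000 \end{bmatrix} = \begin{bmatrix} 2320100 \\ 1330100 \\ 2950100 \end{bmatrix}$$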




Cost function:


This is the same as in simple linear regression. Only the formula used to compute h(x) changes.


Gradient descent:


This too is the same as in simple linear regression. In simple linear regression, however, the equation for reducing theta0 contains no x. Here x0 is attached to theta0, so the equation for reducing every theta value has the same form.
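$$\theta_0 := \theta_0 - \alpha \frac{\partial J}{\partial \theta_0}$$

$$\theta_1 := \theta_1 - \alpha \frac{\partial J}{\partial \theta_1}$$

where

$$\frac{\partial J}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h(x_i) - y_i \right) x_{ij} \quad \text{(for all } j\text{)}$$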
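Because every theta now follows the same update rule, the whole step can be written with matrices. The sketch below is an illustration under assumptions: it uses a tiny data set (x0 = 1 plus one feature, with y equal to that feature) instead of the house data, and the function name `gradient_descent`, the alpha value and the iteration count are chosen here, not taken from the book.

```python
import numpy as np


def gradient_descent(X, y, alpha, iterations):
    """Update all thetas at once: theta := theta - alpha * (1/m) * X^T (X theta - y)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        errors = X @ theta - y          # h(x) - y for every record
        gradient = X.T @ errors / m     # one partial derivative per theta
        theta = theta - alpha * gradient
    return theta


# first column is x0 = 1, second column is the single feature
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])

theta = gradient_descent(X, y, alpha=0.1, iterations=5000)
print(theta)
```

Running the unscaled house data through the same loop would need a far smaller alpha, which is one practical reason the text goes on to discuss feature scaling.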



Finding the minimum cost with a formula:

Instead of using gradient descent, the following equation finds directly the theta that gives the minimum cost. When the number of features is large, though, gradient descent is the better choice, because for many features computing the transpose and the inverse takes a very long time.


$$\theta = (X^{T} X)^{-1} X^{T} y$$
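The formula can be evaluated directly with numpy. This sketch assumes a small data set with more records than columns so that the matrix being inverted is invertible; the house example, with 3 records and 4 columns, would not satisfy that, so it is not used here. All names are illustrative.

```python
import numpy as np

# first column is x0 = 1, second column is the single feature
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])

# theta = (X^T X)^-1 X^T y, computed in one step with no iterations
theta = np.linalg.inv(X.T @ X) @ X.T @ y
print(theta)
```

No alpha and no iteration count are needed here, which is the appeal of the formula when the number of features is small.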


Feature Scaling:


The values of each feature can fall in a different numeric range. For example, the square footage ranges from 800 to 2400, while the number of rooms ranges from 2 to 5.


Sq. ft. = 800, 1200, 2400
Rooms = 2, 3, 5


Normalizing the values in each column so that, instead of lying in such different ranges, they all lie between -1 and +1 or between 0 and 1 is called feature scaling. Mean normalization helps with this. Its formula is as follows.

(particular value - mean of all values) / (maximum - minimum)


Sq. ft. (mean = 1466.67) = (800-1466.67)/(2400-800), (1200-1466.67)/(2400-800), (2400-1466.67)/(2400-800)

= -0.42, -0.17, 0.58

Rooms (mean = 3.33) = (2-3.33)/(5-2), (3-3.33)/(5-2), (5-3.33)/(5-2)

= -0.44, -0.11, 0.56

Likewise, when gradient descent is used in multiple linear regression and each feature lies in a different range, the plot forms very narrow rings packed closely together, and reaching the centre becomes hard. When the values are normalized, the rings are of even size and the centre is easy to reach.
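Mean normalization can be sketched with numpy as follows. The sketch applies the stated formula literally, using the arithmetic mean of each column; the function name `mean_normalize` is an illustrative choice.

```python
import numpy as np


def mean_normalize(values):
    """(value - mean of all values) / (maximum - minimum), applied to one column."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / (values.max() - values.min())


area = mean_normalize([800, 1200, 2400])   # square feet
rooms = mean_normalize([2, 3, 5])          # number of rooms

print(np.round(area, 2))
print(np.round(rooms, 2))
```

After this step both columns are centred on zero and span a range of exactly 1, so neither feature dominates the gradient descent steps.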






10. Pandas 


Pandas is a python library used to shape the data needed for computation the way we want it. With it, source data stored in formats such as CSV, txt, and json can be read into a dataframe and reshaped as required.

In the example we look at here, the various factors that help set the selling price of a house, together with the prices set on that basis, are supplied as a CSV file. This is called the training data. We will build a model on the basis of this data.


Before building the model, we first have to understand the training data. How many records it holds, how many null values it contains, which factors matter most to the price, how to drop factors that are not needed, and how to fill null values with values of our choosing: we will try all of this with Pandas. This is called preprocessing / feature selection. The program for it is as follows.


https://gist.github.com/nithyadurai87/5fd84f40ce26eac65a8060ee2d15280a
import pandas as pd

# data can be downloaded from the url: https://www.kaggle.com/vikrishnan/boston-house-prices
df = pd.read_csv('data.csv')
target = 'SalePrice'

# Understanding data
print(df.shape)
print(df.columns)
print(df.head(5))
print(df.info())
print(df.describe())
print(df.groupby('LotShape').size())

# Dropping null value columns which cross the threshold
a = df.isnull().sum()
print(a)
# len(a) is the number of columns, so the cut-off is 0.05 times that count
b = a[a > (0.05*len(a))]
print(b)
df = df.drop(b.index, axis=1)
print(df.shape)

# Replacing null value columns (text) with most used value
a1 = df.select_dtypes(include=['object']).isnull().sum()
print(a1)
print(a1.index)
for i in a1.index:
    b1 = df[i].value_counts().index.tolist()
    print(b1)
    df[i] = df[i].fillna(b1[0])

# Replacing null value columns (int, float) with most used value
a2 = df.select_dtypes(include=['integer', 'float']).isnull().sum()
print(a2)
b2 = a2[a2 != 0].index
print(b2)
df = df.fillna(df[b2].mode().to_dict(orient='records')[0])

# Creating new columns from existing columns
print(df.shape)
a3 = df['YrSold'] - df['YearBuilt']
b3 = df['YrSold'] - df['YearRemodAdd']
df['Years Before Sale'] = a3
df['Years Since Remod'] = b3
print(df.shape)

# Dropping unwanted columns
df = df.drop(["Id", "MoSold", "SaleCondition", "SaleType", "YearBuilt",
              "YearRemodAdd"], axis=1)
print(df.shape)

# Dropping columns which has correlation with target less than threshold
x = df.select_dtypes(include=['integer', 'float']).corr()[target].abs()
print(x)
df = df.drop(x[x < 0.4].index, axis=1)
print(df.shape)

# Checking for the necessary features after dropping some columns
l1 = ["PID", "MS Subclass", "MS Zoning", "Street", "Alley", "Land Contour",
      "Lot Config", "Neighborhood", "Condition 1", "Condition 2", "Bldg Type",
      "House Style", "Roof Style", "Roof Matl", "Exterior 1st", "Exterior 2nd",
      "Mas Vnr Type", "Foundation", "Heating", "Central Air", "Garage Type",
      "Misc Feature", "Sale Type", "Sale Condition"]
l2 = []
for i in l1:
    if i in df.columns:
        l2.append(i)

# Getting rid of nominal columns with too many unique values
for i in l2:
    len(df[i].unique()) > 10  # result is not used, so every column in l2 is dropped
    df = df.drop(i, axis=1)
print(df.columns)

df.to_csv('training_data.csv', index=False)


Explanation of the program and its output:

The data in the CSV has been loaded through pandas into a dataframe named df. How many rows and columns it holds can be found as follows.

print (df.shape)
(1460, 81)


The following shows which columns are present.


print (df.columns)

Index(['Id', 'MSSubClass', 'MSZoning', 'LotFrontage', 'LotArea', 'Street',
'Alley', 'LotShape', 'LandContour', 'Utilities', 'LotConfig',
'LandSlope', 'Neighborhood', 'Condition1', 'Condition2', 'BldgType',
'HouseStyle', 'OverallQual', 'OverallCond', 'YearBuilt', 'YearRemodAdd',
'RoofStyle', 'RoofMatl', 'Exterior1st', 'Exterior2nd', 'MasVnrType',
'MasVnrArea', 'ExterQual', 'ExterCond', 'Foundation', 'BsmtQual',
'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinSF1',
'BsmtFinType2', 'BsmtFinSF2', 'BsmtUnfSF', 'TotalBsmtSF', 'Heating',
'HeatingQC', 'CentralAir', 'Electrical', '1stFlrSF', '2ndFlrSF',
'LowQualFinSF', 'GrLivArea', 'BsmtFullBath', 'BsmtHalfBath', 'FullBath',
'HalfBath', 'BedroomAbvGr', 'KitchenAbvGr', 'KitchenQual',
'TotRmsAbvGrd', 'Functional', 'Fireplaces', 'FireplaceQu', 'GarageType',
'GarageYrBlt', 'GarageFinish', 'GarageCars', 'GarageArea', 'GarageQual',
'GarageCond', 'PavedDrive', 'WoodDeckSF', 'OpenPorchSF',
'EnclosedPorch', '3SsnPorch', 'ScreenPorch', 'PoolArea', 'PoolQC',
'Fence', 'MiscFeature', 'MiscVal', 'MoSold', 'YrSold', 'SaleType',
'SaleCondition', 'SalePrice'],
dtype='object')


head(5) displays the first 5 records.


print(df.head(5)) 

Id MSSubClass MSZoning ... SaleType SaleCondition SalePrice 
0 1 60 RL ... WD Normal 208500 
1 2 20 RL ... WD Normal 181500

2 3 60 RL ... WD Normal 223500 

3 4 70 RL ... WD Abnorml 140000 

4 5 60 RL ... WD Normal 250000 
[5 rows x 81 columns] 


info() displays details about the structure of our dataframe.


print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1460 entries, 0 to 1459
Data columns (total 81 columns): 

Id 1460 non-null int64 
MSSubClass 1460 non-null int64 


SaleCondition 1460 non-null object 

SalePrice 1460 non-null int64 

dtypes: float64(3), int64(35), object(43) 

memory usage: 924.0+ KB 

None 


describe() computes and displays the key summary statistics.


print(df.describe())

Id MSSubClass ... YrSold SalePrice
count 1460.000000 1460.000000 ... 1460.000000 1460.000000
mean 730.500000 56.897260 ... 2007.815753 180921.195890
std 421.610009 42.300571 ... 1.328095 79442.502883
min 1.000000 20.000000 ... 2006.000000 34900.000000
25% 365.750000 20.000000 ... 2007.000000 129975.000000
50% 730.500000 50.000000 ... 2008.000000 163000.000000
75% 1095.250000 70.000000 ... 2009.000000 214000.000000
max 1460.000000 190.000000 ... 2010.000000 755000.000000
[8 rows x 38 columns]


groupby() groups the values found in a column and displays them.


print(df.groupby('LotShape').size())

LotShape
IR1 484
IR2 41
IR3 10
Reg 925
dtype: int64


The number of null values present in each column is displayed.


print (a) 

Id 0 

MSSubClass 0 
MSZoning 0 
LotFrontage 259 
LotArea 0 
Street 0 
Alley 1369 
LotShape 0 
LandContour 0
Utilities 0 


PoolQC 1453 
Fence 1179 
MiscFeature 1406 
MiscVal 0 
MoSold 0 
YrSold 0 
SaleType 0 
SaleCondition 0 
SalePrice 0 

Length: 81, dtype: int64 


0.05 is set as the threshold for nulls, meaning that 5 null values in 100 are allowed. Columns holding more nulls than the threshold are found and then removed from the dataframe. In the program the cut-off is computed as 0.05*len(a), where a holds one entry per column.


print (b) 

LotFrontage 259 
Alley 1369 
MasVnrType 8 
MasVnrArea 8 
BsmtQual 37 
BsmtCond 37 
BsmtExposure 38 
BsmtFinTypel 37 
BsmtFinType2 38 
FireplaceQu 690 
GarageType 81 
GarageYrBlt 81 
GarageFinish 81 
GarageQual 81 
GarageCond 81 
PoolQC 1453 
Fence 1179
MiscFeature 1406
dtype: int64 


After the 18 columns above are removed, we can see that the total has dropped to 63.


print (df.shape) 
(1460, 63) 


The text columns that hold fewer null values than the threshold are displayed. include=['object'] refers to the text columns.


print (a1)

MSZoning 0 
Street 0 
LotShape 0 


Electrical 1 
KitchenQual 0 
Functional 0 
PavedDrive 0 
SaleType 0 
SaleCondition 0 
dtype: int64 

print (a1.index)

Index(['MSZoning', 'Street', 'LotShape', 'LandContour', 'Utilities',
'LotConfig', 'LandSlope', 'Neighborhood', 'Condition1', 'Condition2',
'BldgType', 'HouseStyle', 'RoofStyle', 'RoofMatl', 'Exterior1st',
'Exterior2nd', 'ExterQual', 'ExterCond', 'Foundation', 'Heating',
'HeatingQC', 'CentralAir', 'Electrical', 'KitchenQual', 'Functional',
'PavedDrive', 'SaleType', 'SaleCondition'],
dtype='object')


For each of these columns, the number of times every value occurs is counted and the values are turned into a list. The first value in the list, that is, the word that occurs most often, is used to fill the null values of that column.


print (b1)

['RL', 'RM', 'FV', 'RH', 'C (all)']

['Pave', 'Grvl']

['Reg', 'IR1', 'IR2', 'IR3']


['Y', 'N', 'P']

['WD', 'New', 'COD', 'ConLD', 'ConLI', 'ConLw', 'CWD', 'Oth', 'Con']
['Normal', 'Partial', 'Abnorml', 'Family', 'Alloca', 'AdjLand']
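The fill-with-the-most-frequent-value step can be seen in isolation on a tiny frame. This is a sketch with made-up data: the column name is borrowed from the data set, but the four values below are invented purely for illustration.

```python
import pandas as pd

# made-up sample with one missing entry
df = pd.DataFrame({"Electrical": ["SBrkr", "SBrkr", None, "FuseA"]})

# value_counts() sorts by frequency, so the first index entry is the most common value
most_common = df["Electrical"].value_counts().index.tolist()[0]
df["Electrical"] = df["Electrical"].fillna(most_common)

print(most_common)
print(df["Electrical"].isnull().sum())
```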


The null values of the numerical columns that hold fewer nulls than the threshold are filled with the value that occurs most often in that column. include=['integer','float'] refers to the numerical columns.

print (a2)

Id 0 

MSSubClass 0 
LotArea 0 


MoSold 0 
YrSold 0 
SalePrice 0 
dtype: int64 

print (b2) 

Index([], dtype='object')


The values in two columns are compared, their difference is computed, and the result is added to the dataframe as a new column. With 2 new columns added, we can see that the 63 columns have become 65.


print (df.shape) 
(1460, 63) 

print (df.shape) 
(1460, 65) 


Columns judged to be unnecessary are named directly and removed from the dataframe, after which the count has changed to 59.

print (df.shape)
(1460, 59) 


The correlation between each numerical column and the target column is computed and displayed. Columns whose value falls below the threshold of 0.4 are removed from the dataframe.


print (x) 

MSSubClass 0.084284 
LotFrontage 0.351799 
LotArea 0.263843 


SalePrice 1.000000 
Years Before Sale 0.523350 
Years Since Remod 0.509079 
Name: SalePrice, dtype: float64 
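The correlation filter can be tried on a few rows of invented data. In this sketch the column names `area`, `noise` and `price` and all the numbers are hypothetical; only the 0.4 threshold and the drop pattern follow the program.

```python
import pandas as pd

# invented numbers: area tracks price closely, noise does not
df = pd.DataFrame({
    "area":  [800, 1200, 2400, 1500],
    "noise": [1, 5, 1, 5],
    "price": [100, 150, 300, 190],
})
target = "price"

# absolute correlation of every numerical column with the target
x = df.corr()[target].abs()
print(x)

# drop the columns whose correlation with the target is below the threshold
df = df.drop(x[x < 0.4].index, axis=1)
print(df.columns.tolist())
```

Taking the absolute value keeps strongly negative correlations as well, since a feature that moves opposite to the price is just as informative as one that moves with it.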


After all of the changes above, the program checks which of the important features we need are still present in the dataframe. Columns that hold too many distinct values are removed. Once the low-correlation columns have been removed, we can see that the column count has changed to 38.


print (df.shape) 
(1460, 38) 
Then the names of the remaining columns are printed.


print (df.columns) 

Index(['MSZoning', 'LotShape', 'LandContour', 'Utilities', 'LotConfig',
'LandSlope', 'Condition1', 'Condition2', 'BldgType', 'HouseStyle',
'OverallQual', 'RoofStyle', 'RoofMatl', 'Exterior1st', 'Exterior2nd',
'ExterQual', 'ExterCond', 'TotalBsmtSF', 'HeatingQC', 'CentralAir',
'Electrical', '1stFlrSF', 'GrLivArea', 'FullBath', 'KitchenQual',
'TotRmsAbvGrd', 'Functional', 'Fireplaces', 'GarageCars', 'GarageArea',
'PavedDrive', 'SalePrice', 'Years Before Sale', 'Years Since Remod'],
dtype='object')


The values in the dataframe are saved as a .csv file named training data. This file is the input for building the model. How to build the model is covered in the next section.


11. Model file handling


11.1 Model Creation 

sklearn (sk for scikit) is a Python library for machine learning. Under the categories of classification and regression, it provides algorithms for many kinds of models, such as linear, ensemble and neural networks. From it we take the LinearRegression algorithm and train it on our data. The program is as follows.


https://gist.github.com/nithyadurai87/91e74160ccb4ff51eef3188372a78b91


import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split,cross_val_score
# On scikit-learn 0.23 and later, use "import joblib" instead.
from sklearn.externals import joblib
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from math import sqrt
import os

df = pd.read_csv('./training_data.csv')

# Move SalePrice so that it becomes the last column
i = list(df.columns.values)
i.pop(i.index('SalePrice'))
df0 = df[i+['SalePrice']]

# Keep only the numerical columns
df = df0.select_dtypes(include=['integer','float'])
print (df.columns)

# Everything except the last column is the input; SalePrice is the target
X = df[list(df.columns)[:-1]]
y = df['SalePrice']

X_train, X_test, y_train, y_test = train_test_split(X, y)
regressor = LinearRegression()
regressor.fit(X_train, y_train)

y_predictions = regressor.predict(X_test)

meanSquaredError=mean_squared_error(y_test, y_predictions)
rootMeanSquaredError = sqrt(meanSquaredError)

print("Number of predictions:",len(y_predictions))
print("Mean Squared Error:", meanSquaredError)
print("Root Mean Squared Error:", rootMeanSquaredError)
print ("Scoring:",regressor.score(X_test, y_test))

plt.plot(y_predictions,y_test,'r.')
plt.plot(y_predictions,y_predictions,'k-')
plt.title('Parity Plot - Linear Regression')
plt.show()

plot = plt.scatter(y_predictions, (y_predictions - y_test), c='b')
plt.hlines(y=0, xmin= 100000, xmax=400000)
plt.title('Residual Plot - Linear Regression')
plt.show()

# Save the trained model to a pickle file
joblib.dump(regressor, './salepricemodel.pkl')


Program output:


Index(['OverallQual', 'TotalBsmtSF', '1stFlrSF', 'GrLivArea', 'FullBath',
'TotRmsAbvGrd', 'Fireplaces', 'GarageCars', 'GarageArea',
'Years Before Sale', 'Years Since Remod', 'SalePrice'],
dtype='object')

Number of predictions: 365
Mean Squared Error: 981297922.7884247
Root Mean Squared Error: 31325.675136993053
Scoring: 0.818899237738355

Explanation of the program:


1. All the data in the file training_data is loaded into df.

2. Everything that helps the prediction is stored in X, and 'SalePrice', the value to be predicted, is stored in y. Before that, pop() removes the column to be predicted from the column list, and it is then appended again as the last column. This way, [:-1] puts everything before the last column into X, and the last column, 'SalePrice', can be stored in y.

3. fit() is used for training, and predict() is used for predicting.

4. score() is used to measure how well our algorithm has learned.

5. train_test_split() splits our whole data set in a 75% - 25% ratio. That is, 75% of the data is used for training and 25% is used for testing and evaluation.

6. The mean_squared_error and sqrt functions report the average of the loss between the values predicted by our algorithm and the actual values. This loss is the 'Residual Error'. It is also drawn as a plot.

7. joblib saves our model as a .pkl file. This is a pickle file, a binary file format used for serialization and de-serialization. How to use it to predict new data is covered in the next section.
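To make items 4 and 6 concrete, the arithmetic behind the mean squared error, its square root, and the R² value that LinearRegression.score() reports can be written out by hand. This is a minimal sketch: the function name regression_metrics and the four prices are made up for illustration and are not taken from the training data.

```python
from math import sqrt

def regression_metrics(actual, predicted):
    """Return (mse, rmse, r2) for two equal-length lists of numbers."""
    n = len(actual)
    # Residual = actual value minus predicted value
    residuals = [a - p for a, p in zip(actual, predicted)]
    squared_error_sum = sum(r * r for r in residuals)
    mse = squared_error_sum / n
    rmse = sqrt(mse)
    # R^2 compares the model's squared error with that of always
    # predicting the mean of the actual values
    mean_actual = sum(actual) / n
    total_sum = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - squared_error_sum / total_sum
    return mse, rmse, r2

# Hypothetical prices, for illustration only
actual = [200000, 150000, 300000, 250000]
predicted = [210000, 140000, 290000, 260000]

mse, rmse, r2 = regression_metrics(actual, predicted)
print("MSE:", mse)
print("RMSE:", rmse)
print("R2:", r2)
```

Because RMSE is in the same unit as SalePrice, it is usually easier to read than MSE.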
11.2 Prediction


Let us give only the first record from our file and ask the model to predict its price. The record is supplied in a file named input.json.


cat input.json

{
"OverallQual":[7],
"TotalBsmtSF":[856],
"1stFlrSF":[856],
"GrLivArea":[1710],
"FullBath":[2],
"TotRmsAbvGrd":[8],
"Fireplaces":[0],
"GarageCars":[2],
"GarageArea":[548],
"Years Before Sale":[5],
"Years Since Remod":[5]
}


The program that calls predict() is as follows.


https://gist.github.com/nithyadurai87/4a31b465220448ab05b84d2227e4e8a5


import os
import json
import pandas as pd
import numpy
# On scikit-learn 0.23 and later, use "import joblib" instead.
from sklearn.externals import joblib

s = pd.read_json('./input.json')
p = joblib.load("./salepricemodel.pkl")
r = p.predict(s)

print (str(r))


Program output:


[213357.65598157] 


The actual SalePrice of this record is 208500, while our program outputs 213357. That is reasonably close. Since the score of our algorithm is about 81%, some difference of this kind is to be expected.

Explanation of the program:


1. joblib.load() de-serializes what was stored in binary form and turns it back into the trained algorithm.

2. predict(), called on the loaded model, takes the data supplied in JSON form as input and returns the predicted value.


Next, we will see how the input and output of this prediction can be exposed as a REST API.
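The serialize and de-serialize round trip can be tried with Python's built-in pickle module, which works on the same principle as the .pkl file written by joblib. The sketch below is an assumption-laden stand-in: a plain dictionary of made-up parameters plays the role of the trained model, and predict here is a hypothetical helper, not the scikit-learn method.

```python
import pickle

def predict(params, xs):
    """Hypothetical stand-in for a model: y = slope * x + intercept."""
    return [params["slope"] * x + params["intercept"] for x in xs]

# Made-up parameters playing the role of a trained model
params = {"slope": 2.0, "intercept": 1.0}

blob = pickle.dumps(params)      # serialize to a bytes object
restored = pickle.loads(blob)    # de-serialize back into a dictionary

print(type(blob).__name__)
print(predict(restored, [1, 2, 3]))
```

The restored object behaves exactly like the original, which is why a model saved once can be loaded later and used for prediction without retraining.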


11.3 Flask API 


Flask is used to expose the value predicted by our algorithm as an API. The program is as follows.


https://gist.github.com/nithyadurai87/9d04097e006e2fe6c7a96b1da643cb3a


import os
import json
import pandas as pd
import numpy

from flask import Flask, render_template, request, jsonify
from pandas.io.json import json_normalize
# On scikit-learn 0.23 and later, use "import joblib" instead.
from sklearn.externals import joblib

app = Flask(__name__)

port = int(os.getenv('PORT', 5500))

@app.route('/')
def home():
    return render_template('index.html')

@app.route('/api/salepricemodel', methods=['POST'])
def salepricemodel():
    if request.method == 'POST':
        try:
            # Read the JSON body and turn it into a dataframe
            post_data = request.get_json()
            json_data = json.dumps(post_data)
            s = pd.read_json(json_data)
            # Load the saved model and predict
            p = joblib.load("./salepricemodel.pkl")
            r = p.predict(s)
            return str(r)
        except Exception as e:
            # Flask cannot return an exception object, so return its message
            return str(e)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=port, debug=True)


Program output:


* Serving Flask app "flaskapi" (lazy loading) 

* Environment: production 

WARNING: Do not use the development server in a production 
environment. 

Use a production WSGI server instead. 

* Debug mode: on 

* Restarting with stat 

* Debugger is active! 

* Debugger PIN: 690-746-333 

* Running on http://0.0.0.0:5500/ (Press CTRL+C to quit) 


This can be tested with a tool called Postman.
[Figure: Postman screenshot. A POST request to http://localhost:5500/api/salepricemodel, with the contents of input.json as the raw body, returns [213357.65598157].]
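The same request can also be prepared from Python with the standard library alone. This is a sketch under assumptions: build_request is a made-up helper name, and the URL assumes the Flask app above is running locally on port 5500. The code only builds the request; the commented lines at the end show how it would be sent.

```python
import json
import urllib.request

def build_request(payload, url="http://localhost:5500/api/salepricemodel"):
    """Hypothetical helper: build a JSON POST request for the API above."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Same record as input.json
payload = {
    "OverallQual": [7], "TotalBsmtSF": [856], "1stFlrSF": [856],
    "GrLivArea": [1710], "FullBath": [2], "TotRmsAbvGrd": [8],
    "Fireplaces": [0], "GarageCars": [2], "GarageArea": [548],
    "Years Before Sale": [5], "Years Since Remod": [5],
}

req = build_request(payload)
print(req.get_method(), req.full_url)

# To send it while the Flask app is running:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

The Content-Type header matters because request.get_json() in Flask only parses bodies that are declared as JSON.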



11.4 Model comparison 


When building our model we should not rely on linear regression alone. It should be compared with several other algorithms, and whichever performs best should be used. The program for this is as follows. It fits our data with a range of algorithms and prints the Score and RMSE value of each one. We can then choose the best among them.

https://gist.github.com/nithyadurai87/9ecfcbf04593d245e26316d52b0708e1


# Written against an older scikit-learn. Newer releases dropped the
# normalize= argument, renamed criterion='mse' and loss='ls' to
# 'squared_error', and moved joblib into its own package.
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split,cross_val_score
from sklearn.externals import joblib
from sklearn.metrics import mean_squared_error
from azure.storage.blob import BlockBlobService
import matplotlib.pyplot as plt
from math import sqrt
import numpy as np
import os

df = pd.read_csv('./training_data.csv')

i = list(df.columns.values)
i.pop(i.index('SalePrice'))
df0 = df[i+['SalePrice']]

df = df0.select_dtypes(include=['integer','float'])

X = df[list(df.columns)[:-1]]
y = df['SalePrice']

X_train, X_test, y_train, y_test = train_test_split(X, y)

# Each function trains one algorithm and returns (score, RMSE)
def linear():
    regressor = LinearRegression()
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def ridge():
    regressor = Ridge(alpha=.3, normalize=True)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def lasso():
    regressor = Lasso(alpha=0.00009, normalize=True)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def elasticnet():
    regressor = ElasticNet(alpha=1, l1_ratio=0.5, normalize=False)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def randomforest():
    regressor = RandomForestRegressor(n_estimators=15,min_samples_split=15,criterion='mse',max_depth=None)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    print("Selected Features for RamdomForest",regressor.feature_importances_)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def perceptron():
    regressor = MLPRegressor(hidden_layer_sizes=(5000, ), activation='relu', solver='adam', max_iter=1000)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    print("Co-efficients of Perceptron",regressor.coefs_)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def decisiontree():
    regressor = DecisionTreeRegressor(min_samples_split=30,max_depth=None)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    print("Selected Features for DecisionTrees",regressor.feature_importances_)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def adaboost():
    regressor = AdaBoostRegressor(random_state=8, loss='exponential').fit(X_train, y_train)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    print("Selected Features for Adaboost",regressor.feature_importances_)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def extratrees():
    regressor = ExtraTreesRegressor(n_estimators=50).fit(X_train, y_train)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    print("Selected Features for Extratrees",regressor.feature_importances_)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

def gradientboosting():
    regressor = GradientBoostingRegressor(loss='ls', n_estimators=500, min_samples_split=15).fit(X_train, y_train)
    regressor.fit(X_train, y_train)
    y_predictions = regressor.predict(X_test)
    print("Selected Features for Gradientboosting",regressor.feature_importances_)
    return (regressor.score(X_test, y_test),sqrt(mean_squared_error(y_test, y_predictions)))

print ("Score, RMSE values")
print ("Linear = ",linear())
print ("Ridge = ",ridge())
print ("Lasso = ",lasso())
print ("ElasticNet = ",elasticnet())
print ("RandomForest = ",randomforest())
print ("Perceptron = ",perceptron())
print ("DecisionTree = ",decisiontree())
print ("AdaBoost = ",adaboost())
print ("ExtraTrees = ",extratrees())
print ("GradientBoosting = ",gradientboosting())

Program output:


Score, RMSE values 

Linear = (0.7437086925668539, 40067.32048747698) 

Ridge = (0.7426559924644496, 40149.523137601194) 

Lasso = (0.7437086997392647, 40067.31992682729) 

ElasticNet = (0.7427716507607811, 40140.499909601196) 
RandomForest = (0.7816174352942802, 36985.57224959144) 
Perceptron = (0.7090884723574984, 42687.80529374248) 
DecisionTree = (0.7205230305007451, 41840.45264436496) 
AdaBoost = (0.7405881117926998, 40310.51057481991) 
ExtraTrees = (0.8112271823246542, 34386.90514804029) 
GradientBoosting = (0.770865727419495, 37885.095662535474) 

Selected Features for RamdomForest [0.61070268 0.04279095 
0.04336447 0.17066371 0.01107406 0.01329107 
0.0065515 0.03938371 0.02458596 0.02051551 0.01707638] 

Selected Features for DecisionTrees [0.75618387 0.03596786 
0.02304119 0.13037245 0.0022674 0. 0.00739768 0.01056845 
0.01184136 0.01171254 0.01064719] 

Selected Features for Adaboost [0.38413232 0.18988447 
0.03844386 0.12826885 0.03857277 0.03995005 
0.01059839 0.08066205 0.05036717 0.01473333 0.02438674] 

Selected Features for Extratrees [0.33168574 0.04675749 
0.05913052 0.11159271 0.05178125 0.02947481 
0.03966461 0.16786223 0.06241882 0.05316226 0.04646956]


Selected Features for Gradientboosting [0.04426232 0.16359645 
0.14768597 0.25403034 0.02119119 0.04361512 
0.01825781 0.01626673 0.15891844 0.07188963 0.06028599] 


Co-efficients of Perceptron [array([[ 2.83519650e-01, 7.33024272e-03, 2.80373628e-01, ..., -1.43939606e-03, -3.84913926e-02],
[ 1.34495184e-01, 1.31687141e-02, 1.72078666e-04, ..., 1.70666499e-23, -2.31494718e-02, -1.08758545e-02],
[ 9.44490485e-02, -2.34835375e-02, 2.37798999e-02, ..., -1.74549692e-02, -2.70192753e-02, -3.67706290e-02],
...,
[ 1.59527225e-01, -3.19744701e-02, -1.22884400e-01, ..., -2.35994429e-26, -3.03880584e-02, -2.85251050e-02],
[-3.63149939e-01, -4.05674884e-02, 2.66679331e-01, ..., -1.73628910e-02, 7.40224353e-03, -6.89871249e-03],
[-4.30743882e-01, 7.07948777e-03, 3.34518179e-01, ..., -1.74075111e-02, 3.47755293e-02, -2.64627071e-02]]),
array([[ 0.16789784],[-0.01864141],[ 0.20432696], ...,
[ 0.01739125],[-0.02779454],[-0.00476935]])]


When choosing a model, besides the Score and RMSE values we should also take things such as Threshold Limit and Sensitivity into account. These, and each of the algorithms mentioned above, are explained in detail later. Some of the algorithms above also report which features they relied on for their predictions. Linear, ridge, lasso and elasticnet do not have this property. For such algorithms we have to select the features ourselves with the RFE technique and supply them. This is covered in the 'Feature Selection' section.
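Choosing the best entry from the comparison can be automated. The sketch below simply copies the (Score, RMSE) pairs from the output above into a dictionary and picks the algorithm with the highest score and the one with the lowest RMSE. The variable names are chosen here for illustration.

```python
# (Score, RMSE) pairs copied from the program output above
results = {
    "Linear": (0.7437086925668539, 40067.32048747698),
    "Ridge": (0.7426559924644496, 40149.523137601194),
    "Lasso": (0.7437086997392647, 40067.31992682729),
    "ElasticNet": (0.7427716507607811, 40140.499909601196),
    "RandomForest": (0.7816174352942802, 36985.57224959144),
    "Perceptron": (0.7090884723574984, 42687.80529374248),
    "DecisionTree": (0.7205230305007451, 41840.45264436496),
    "AdaBoost": (0.7405881117926998, 40310.51057481991),
    "ExtraTrees": (0.8112271823246542, 34386.90514804029),
    "GradientBoosting": (0.770865727419495, 37885.095662535474),
}

# Higher score is better; lower RMSE is better
best_by_score = max(results, key=lambda name: results[name][0])
best_by_rmse = min(results, key=lambda name: results[name][1])

print("Best score:", best_by_score, results[best_by_score][0])
print("Lowest RMSE:", best_by_rmse, results[best_by_rmse][1])
```

Because train_test_split() shuffles the data, these numbers change from run to run, so a ranking based on a single run should be treated as indicative only.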


11.5 Improving Model score 


If the score of the model we built is very low, we should draw plots such as trend and parity plots to find out where the predictions deviate most. In the following example, the various attributes that determine the price of a house, together with the house prices fixed on that basis, are given for training. The model built from this data has a score of 35. Trend and parity plots are therefore drawn to find out where the actual price and the predicted price differ most.


Program and its output:


https://gist.github.com/nithyadurai87/ca54a4a8f59187cb988b5145d000c70c


import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split,cross_val_score
# On scikit-learn 0.23 and later, use "import joblib" instead.
from sklearn.externals import joblib
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from math import sqrt
import os

df = pd.read_csv('./training_data.csv')

X = df[list(df.columns)[:-1]]
y = df['SalePrice']

X_train, X_test, y_train, y_test = train_test_split(X, y)
regressor = LinearRegression()
regressor.fit(X_train, y_train)

y_predictions = regressor.predict(X_test)

meanSquaredError=mean_squared_error(y_test, y_predictions)
rootMeanSquaredError = sqrt(meanSquaredError)

print("Number of predictions:",len(y_predictions))
print("Mean Squared Error:", meanSquaredError)
print("Root Mean Squared Error:", rootMeanSquaredError)
print ("Scoring:", regressor.score(X_test, y_test))

## TREND PLOT
y_test25 = y_test[:35]
y_predictions25 = y_predictions[:35]
myrange = [i for i in range(1,36)]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid()
plt.plot(myrange,y_test25, marker='o')
plt.plot(myrange,y_predictions25, marker='o')
plt.title('Trend between Actual and Predicted - 35 samples')
ax.set_xlabel("No. of Data Points")
ax.set_ylabel("Values- SalePrice")
plt.legend(['Actual points','Predicted values'])
plt.savefig('TrendActualvsPredicted.png',dpi=100)
plt.show()

## PARITY PLOT
# Band of +/- 50000 around the actual values
y_testp = y_test[:]+50000
y_testm = y_test[:]-50000
fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid()
plt.plot(y_test,y_predictions,'r.')
plt.plot(y_test,y_test,'k-', color = 'green')
plt.plot(y_test,y_testp,color = 'blue')
plt.plot(y_test,y_testm, color = 'blue')
plt.title('Parity Plot')
ax.set_xlabel("Actual Values")
ax.set_ylabel("Predicted Values")
plt.legend(['Actual vs Predicted points','Actual value line','Threshold of 50000'])
plt.show()

## Data Distribution
fig = plt.figure()
plt.plot([i for i in range(1,1461)],y,'r.')
plt.title('Data Distribution')
plt.show()

# Count the prices above and below 250000
a, b = 0 , 0
for i in range(0,1460):
    if(y[i]>250000):
        a += 1
    else:
        b +=1

print(a, b)

#X = X[:600]
#y = y[:600]


The trend plot shows how closely the actual prices and the prices predicted by the model follow each other.
[Figure 1: Trend between Actual and Predicted - 35 samples. x-axis: No. of Data Points; y-axis: Values- SalePrice; legend: Actual points, Predicted values.]


The parity plot sets a threshold band around the actual values. That is, the price difference is allowed to go up to 50 thousand in either direction, and the plot shows how many prices fall inside this threshold and how many lie beyond it.

Next, a data distribution chart is drawn. Its X-axis holds the 1460 records given for training and its Y-axis holds the sale prices. For the first 600 records, the sale prices are spread widely, from about 1 lakh to 5 lakh. Beyond 600, up to about 1000 records, nearly all the sale prices sit around just 2 lakh. This is the reason for the model's low score. We have already seen that the data given for training should be spread evenly. That is not the case here.

So if we build the model using only the records up to the point where the data is well spread, the score goes up. After the lines X = df[list(df.columns)[:-1]] and y = df['SalePrice'], add the lines X = X[:600] and y = y[:600]. The model is then built from the first 600 records only.
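What the parity plot shows visually can also be expressed as a single number: the share of predictions that fall inside the band of 50000 around the actual price. This is a minimal sketch; fraction_within_band is a made-up helper and the prices are hypothetical, chosen only to show the calculation.

```python
def fraction_within_band(actual, predicted, band=50000):
    """Return the share of predictions within +/- band of the actual value."""
    inside = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= band)
    return inside / len(actual)

# Hypothetical prices, for illustration only
actual = [200000, 150000, 300000, 450000]
predicted = [210000, 140000, 380000, 390000]

share = fraction_within_band(actual, predicted)
print("Share of predictions inside the band:", share)
```

Computing this share before and after restricting the training data (for example to the first 600 records) gives a quick check of whether the change actually helped.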



Output:

Number of predictions: 365
Mean Squared Error: 2312162517.277571
Root Mean Squared Error: 48084.95104788578
Scoring: 0.34729555622354125
97 1363


Finally, the program counts how many of the 1460 records given for training have a value above 250000 and how many are below it. 1363 values are at or below 250000, and only 97 values are above it. So the data is not evenly spread. Such values are called outliers. How to find and remove outliers like these is covered in the next section.

12. Feature Selection


When a file contains many columns, feature selection is the task of finding out which of those columns actually determine the value we are predicting. For example, from a file with 400 or 500 columns, picking the few important columns that help the prediction is feature selection.

First, the columns we have should be divided into 3 kinds: process variables, manipulated variables and disturbance variables. Of these, manipulated and disturbance variables are the input parameters, and the process variable is the output parameter.


• Manipulated Variables (MV) - The values in columns of this kind can be changed by us. We can adjust them as we need.

• Disturbance Variables (DV) - We cannot change these directly. Their values depend on the values of the manipulated variables.


• Process Variables (PV) - These hold the values that result from the various processes. That is, their values depend on the values of the other columns. So there is nothing here for us to change.


After dividing the variables in this way, the relationship between each variable and the other variables has to be computed. This is called correlation. Its value ranges from -1 to +1. -1 indicates a negative relationship and +1 indicates a positive relationship.


For example, suppose we want to predict one thing, "body weight", from features such as "amount of food eaten in a day", "time spent exercising" and "time spent reading books on weight loss". The values in the correlation matrix would then be as follows.


• Positive Correlation: The relationship between body weight and the amount of food eaten comes out as +1. If the amount of food increases, the weight increases.

• Negative Correlation: The relationship between body weight and the time spent exercising comes out as -1. If the time spent exercising increases, the body weight decreases.


• Zero Correlation: The relationship with the time spent reading books on weight loss comes out as 0. There is no connection between reading time and body weight.


If there are other features besides these, their values lie between -1 and 1, depending on the relationship they have.
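The three cases above can be reproduced with the Pearson correlation coefficient written out by hand. This is an illustrative sketch: the numbers are invented so that food and weight rise together, exercise and weight move in opposite directions, and reading time has no linear relation to weight. They are not real measurements.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Invented data for five people, for illustration only
weight = [55, 60, 65, 70, 75]             # body weight
food = [1500, 1800, 2100, 2400, 2700]     # food eaten in a day
exercise = [90, 75, 60, 45, 30]           # minutes of exercise
reading = [10, 30, 20, 30, 10]            # minutes of reading

print("food vs weight:", round(pearson(food, weight), 2))
print("exercise vs weight:", round(pearson(exercise, weight), 2))
print("reading vs weight:", round(pearson(reading, weight), 2))
```

On a pandas dataframe the same matrix comes from df.corr(), which uses the Pearson coefficient by default.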
12.1 Highly Correlated features (MV - DV)


In the following example, which columns of the file data.csv are which kind of parameter is something we learn with the help of a domain expert. For example, of the 26 features named A to Z, let A, B, C, D, E and F be the process parameters and the rest be manipulated and disturbance parameters. So first, all the process parameters are removed from the dataframe. Then the correlation among the remaining manipulated and disturbance parameters is computed and written out both as a file and as a plot. Those that have a very high positive or negative relationship are removed from the dataframe. That is, of two features whose correlation is -0.98, -0.99, -1, 0.98, 0.99 or 1, one is removed. In this way, the manipulated and disturbance variables that are very strongly related to each other are found, and one of each such pair is dropped. Everything that remains is saved under the name training_data. This becomes the input for finding the relationship between our process variable and the selected manipulated and disturbance variables. The next step is to find the relationship each of them has with the process variable we want to predict, and to remove the columns that have 0 correlation with it.

https://gist.github.com/nithyadurai87/5a43155d33cf5288204def23661704d0
import pandas as pd
import matplotlib.pyplot as plt
import numpy
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split,cross_val_score
from sklearn.metrics import mean_squared_error
from math import sqrt
from sklearn.feature_selection import RFE
from sklearn.datasets import make_friedman1

df = pd.read_csv('./data.csv')

# Dropping all process parameters
df = df.drop(["A","B", "C", "D", "E", "F"], axis=1)

#finding correlation between manipulated & disturbance variables
correlations = df.corr()
correlations = correlations.round(2)
correlations.to_csv('MV_DV_correlation.csv',index=False)

# Draw the correlation matrix as a heatmap
fig = plt.figure()
g = fig.add_subplot(111)
cax = g.matshow(correlations, vmin=-1, vmax=1)
fig.colorbar(cax)
ticks = numpy.arange(0,20,1)
g.set_xticks(ticks)
g.set_yticks(ticks)
g.set_xticklabels(list(df.columns))
g.set_yticklabels(list(df.columns))
plt.savefig('MV_DV_correlation.png')

#removing parameters with high correlation
# Keep only the upper triangle so that each pair is checked once.
# bool replaces numpy.bool, which was removed in NumPy 1.24.
upper = correlations.where(numpy.triu(numpy.ones(correlations.shape), k=1).astype(bool))
cols_to_drop = []
for i in upper.columns:
    if (any(upper[i] == -1) or any(upper[i] == -0.98) or any(upper[i] == -0.99) or any(upper[i] == 0.98) or any(upper[i] == 0.99) or any(upper[i] == 1)):
        cols_to_drop.append(i)
df = df.drop(cols_to_drop, axis=1)

print (df.shape,df.columns)

df.to_csv('./training_data.csv',index=False)

Program output:


(20, 17) Index(['G', 'H', 'J', 'K', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V',
'W', 'X', 'Y', 'Z'], dtype='object')
[Figure: heatmap of the correlation matrix between the manipulated and disturbance variables, with each cell labelled by its correlation value between -1 and 1 (saved by the program as MV_DV_correlation.png).]

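The rule used above, keeping one column of every very strongly correlated pair, can be sketched without pandas. This is an illustration under assumptions: columns_to_drop is a made-up helper, the three column names and their correlations are hypothetical, and a single cut-off on the absolute value (0.98) stands in for the program's list of exact matches on values rounded to two decimals, which selects the same pairs.

```python
def columns_to_drop(corr, names, threshold=0.98):
    """Return the columns to remove.

    corr[a][b] is the correlation between columns a and b. Only pairs
    (earlier, later) in column order are checked, which corresponds to
    walking the upper triangle of the matrix, and the later column of
    each strongly correlated pair is marked for removal.
    """
    drop = []
    for position, later in enumerate(names):
        for earlier in names[:position]:
            if abs(corr[earlier][later]) >= threshold and later not in drop:
                drop.append(later)
    return drop

# Hypothetical columns and correlations, for illustration only
names = ["c1", "c2", "c3"]
corr = {
    "c1": {"c1": 1.0, "c2": 0.99, "c3": 0.20},
    "c2": {"c1": 0.99, "c2": 1.0, "c3": 0.25},
    "c3": {"c1": 0.20, "c2": 0.25, "c3": 1.0},
}

to_drop = columns_to_drop(corr, names)
print(to_drop)
```

Only one column of the pair is dropped because the two carry almost the same information; keeping both adds nothing to the model.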


12.2 Zero Correlated features (PV - MV,DV)


Let "A" be the process parameter we want to predict. Append this "A" as the last column of the training_data file and pass it as input to the program below. The program then finds the correlation between A and the other parameters and removes the MV and DV columns that have close to 0 correlation with it. Here, columns whose absolute correlation with A is below the threshold are removed; the program below uses 0.06, so only columns with values such as 0.01 to 0.05 are dropped.


https://gist.github.com/nithyadurai87/e0cca6ec864405a032888244122a90d8


import pandas as pd
import matplotlib.pyplot as plt
import numpy
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split,cross_val_score
from sklearn.metrics import mean_squared_error
from math import sqrt
from sklearn.feature_selection import RFE
from sklearn.datasets import make_friedman1

df = pd.read_csv('./training_data.csv')
print (df.shape,df.columns)

# Dropping columns whose correlation with the target is less than the threshold
target = "A"

# Absolute correlation of every column with the target, rounded to 2 decimals
correlations = df.corr()[target].abs()
correlations = correlations.round(2)

correlations.to_csv('./PV_MVDV_correlation.csv',index=False)
df=df.drop(correlations[correlations<0.06].index, axis=1)

print (df.shape,df.columns)

df.to_csv('./training.csv',index=False)


Program output:


(20, 18) Index(['G', 'H', 'J', 'K', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V',
       'W', 'X', 'Y', 'Z', 'A'], dtype='object')
(20, 17) Index(['G', 'H', 'J', 'K', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V',
       'W', 'X', 'Y', 'A'], dtype='object')

Correlation values written to PV_MVDV_correlation.csv (spreadsheet row, absolute correlation with A):

 2    0.32
 3    0.84
 4    0.62
 5    0.06
 6    0.78
 7    0.14
 8    0.87
 9    0.85
10    0.14
11    0.4
12    0.33
13    0.16
14    0.87
15    0.79
16    0.81
17    0.05
18    1
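The filtering step above can be sketched without pandas to make the arithmetic visible. This is only an illustration under stated assumptions: the helper `pearson`, the toy columns and their values are hypothetical and are not taken from training_data.csv; only the target name "A" and the 0.06 threshold come from the program above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy data: "A" is the target, the others are candidate columns.
columns = {
    "G": [1, 2, 3, 4, 5, 6],      # rises with A
    "H": [6, 5, 4, 3, 2, 1],      # falls as A rises
    "Z": [1, 0, 0, 0, 0, 1],      # built to be uncorrelated with A
    "A": [2, 4, 6, 8, 10, 12],
}
target = "A"
threshold = 0.06

# Absolute correlation of every column with the target, rounded to 2 decimals
correlations = {
    name: round(abs(pearson(values, columns[target])), 2)
    for name, values in columns.items()
}

# Keep only the columns at or above the threshold
kept = [name for name, corr in correlations.items() if corr >= threshold]

print(correlations)
print(kept)
```

Taking the absolute value first means a strongly negative relationship (column H here) is kept just like a strongly positive one; only columns with almost no linear relationship to the target fall below the threshold.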


12.3 Recursive Feature Elimination Technique

This is called the RFE technique. Algorithms such as Randomforest, Decisiontree, Adaboost, Extratrees and gradient boosting are capable of selecting features on their own. For algorithms such as linear regression, ridge, lasso and elasticnet, however, we have to select the features through a technique like this one and supply them ourselves. The technique takes an algorithm as its input and assigns a ranking to every feature. We can then pick and use only the features that received rank 1.

https://gist.github.com/nithyadurai87/34ca5b0e8a9f5908276240eb099247ad


import pandas as pd
import matplotlib.pyplot as plt
import numpy
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split,cross_val_score
from sklearn.metrics import mean_squared_error
from math import sqrt
from sklearn.feature_selection import RFE
from sklearn.datasets import make_friedman1

df = pd.read_csv('./training.csv')

# Every column except the last one ('A') is a feature
X = df[list(df.columns)[:-1]]
y = df['A']

X_train, X_test, y_train, y_test = train_test_split(X, y)

regressor = DecisionTreeRegressor(min_samples_split=3,max_depth=None)
regressor.fit(X_train, y_train)
y_predictions = regressor.predict(X_test)

# A fitted tree exposes one importance score per feature
print ("Selected Features for DecisionTree",regressor.feature_importances_)

# RFE Technique - Recursive Feature Elimination
# Note: X and y are replaced here by a synthetic dataset
X, y = make_friedman1(n_samples=20, n_features=17, random_state=0)
selector = RFE(LinearRegression())
selector = selector.fit(X, y)

print ("Selected Features for LinearRegression",selector.ranking_)


The feature_importances_ attribute works on the decision tree and, as the output shows, reports an importance score for each feature. It does not work on linear regression.

So for linear regression we have to produce the ranking ourselves through RFE. After that we can select and use only the features that came out with Rank 1.

Program output:


Selected Features for DecisionTree [9.52359304e-04 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 6.15147906e-03 2.23327627e-03 7.70622020e-02
 0.00000000e+00 0.00000000e+00 1.10263284e-03 2.33946020e-04
 0.00000000e+00 0.00000000e+00 9.12264104e-01 0.00000000e+00]

Selected Features for LinearRegression [ 1  1 10  1  1  9  8  3  5  2  6  7  1  1  1  4  1]
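Picking out the rank 1 features can be sketched as below. This is a hedged illustration, not the author's code: it reuses the same synthetic `make_friedman1` data as the listing above, and the feature names `f0` … `f16` are made up for the example. It assumes scikit-learn's default behaviour for `RFE`, which keeps half of the features when `n_features_to_select` is not given.

```python
from sklearn.datasets import make_friedman1
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Same synthetic data as in the listing above
X, y = make_friedman1(n_samples=20, n_features=17, random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

selector = RFE(LinearRegression()).fit(X, y)

# Features with rank 1 are the ones RFE decided to keep
selected = [name for name, rank in zip(feature_names, selector.ranking_) if rank == 1]

# transform() returns only the kept columns, ready for training
X_selected = selector.transform(X)

print(selected)
print(X_selected.shape)
```

`selector.support_` holds the same information as a boolean mask, which is convenient when the columns live in a DataFrame.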

13. Outliers Removal


An outlier is a value that differs from the rest of the data and lies far away from it. If a series of values runs 5, 10, 15, 20...75 and just one value among them is far larger than all the others, that value is the outlier. This is what we need to detect and weed out.


In the example below, the outliers in the input file are detected column by column and shown as a plot. A boxplot or a violinplot is used for this.


https://gist.github.com/nithyadurai87/1756b2a5ec421fc3f36add04909cc517


import pandas as pd
import pylab
import numpy as np
from scipy import stats
from scipy.stats import kurtosis
from scipy.stats import skew
import matplotlib._pylab_helpers

df = pd.read_csv('./14_input_data.csv')

# Finding outliers in the data: one box plot per column
for i in range(len(df.columns)):
    pylab.figure()
    pylab.boxplot(df[df.columns[i]])
    #pylab.violinplot(df[df.columns[i]])
    pylab.title(df[df.columns[i]].name)

# Collect every open figure
list1=[]
for i in matplotlib._pylab_helpers.Gcf.get_all_fig_managers():
    list1.append(i.canvas.figure)
print (list1)

# Save each figure under the name of its column
for i, j in enumerate(list1):
    j.savefig(df[df.columns[i]].name)

# Removing outliers
z = np.abs(stats.zscore(df))
print(z)
print(np.where(z > 3))
print(z[53][9])
df1 = df[(z < 3).all(axis=1)]
print (df.shape)
print (df1.shape)


The plots drawn for each column are stored in list1.


print(list1)

[<Figure size 640x480 with 1 Axes>, <Figure size 640x480 with 1 Axes>,
 <Figure size 640x480 with 1 Axes>, <Figure size 640x480 with 1 Axes>,
 <Figure size 640x480 with 1 Axes>, <Figure size 640x480 with 1 Axes>,
 <Figure size 640x480 with 1 Axes>, <Figure size 640x480 with 1 Axes>,
 <Figure size 640x480 with 1 Axes>, <Figure size 640x480 with 1 Axes>]

Then savefig() saves the plot of each column under that column's own name. Of the pictures below, the one on the left is the violin plot for 'SalePrice' and the one on the right is the box plot for the same column. Wherever points sit apart from the main body of the plot, we can take that as the place where the outliers lie. Here the plot shows outliers in SalePrice above 300000 and below 100000.


[Figure: violin plot (left) and box plot (right) of SalePrice]

Next, let us see how to remove such outliers. Z Score, IQR Score and similar measures are used for this. The Z Score calculates how far a value lies from the mean of its column, measured in standard deviations. Values that lie too far away can be taken as outliers.

Here z holds, for every value, how far it lies from the mean of its column.

print(z)

[[0.65147924 0.45930254 0.79343379 ... 0.31172464 0.35100032 0.4732471 ]
 [0.07183611 0.46646492 0.25714043 ... 0.31172464 0.06073101 0.01235858]
 [0.65147924 0.31336875 0.62782603 ... 0.31172464 0.63172623 0.74302803]
 ...
 [0.65147924 0.21564122 0.06565646 ... 1.02685765 1.03391416 0.23194227]
 [0.79515147 0.04690528 0.21898188 ... 1.02685765 1.09005935 0.23192429]
 [0.79515147 0.45278362 0.2416147  ... 1.02685765 0.9216238  0.2319063 ]]


The values above are not enough on their own to say which points are outliers. A threshold has to be set. In general 3 is used as the threshold, and everything that lies above 3 is an outlier. The command to print only these outliers is as follows.


print(np.where(z > 3))


(array([  53,   58,  112,  118,  151,  161,  166,  178,  178,  185,  185,
         185,  197,  224,  224,  224,  231,  278,  304,  309,  309,  313,
         321,  332,  336,  349,  375,  378,  389,  440,  440,  440,  473,
         477,  481,  496,  496,  496,  496,  515,  523,  523,  523,  527,
         529,  533,  581,  585,  591,  605,  608,  635,  635,  642,  664,
         691,  691,  691,  769,  769,  798,  803,  825,  897,  898,  910,
        1024, 1031, 1044, 1044, 1061, 1169, 1173, 1182, 1182, 1182, 1190,
        1230, 1268, 1298, 1298, 1298, 1298, 1298, 1298, 1350, 1353, 1373,
        1373, 1386], dtype=int64),
 array([9, 9, 9, 3, 9, 9, 6, 8, 9, 3, 5, 9, 1, 2, 9, 9, 9, 3, 6, 9, 9,
        9, 1, 9, 9, 0, 9, 9, 1, 2, 9, 9, 9, 9, 1, 2, 3, 9, 9, 1, 2, 3, 9,
        2, 0, 8, 9, 9, 6, 3, 3, 5, 6, 8, 1, 2, 3, 3, 5, 3, 5, 8, 5, 2, 5,
        2, 5, 1, 2, 8, 3, 9, 1, 2, 3, 8, 5, 3, 1, 2, 3, 5, 6, 8, 5, 3, 1,
        2, 9], dtype=int64))

Notice that the output above contains two arrays. The first array() holds the row of each place where an outlier sits, and the second array() holds its column. So when we give print(z[53][9]), we can see that the Z score at row 53, column 9 comes out as 3.647669390284779.

Finally, only the rows whose values are all below 3 are stored in a new dataframe. That dataframe is the data with the outliers removed.

df1 = df[(z < 3).all(axis=1)]


So we can see that the old dataframe has 1460 rows and the new dataframe has 1396 rows.


(1460, 10)
(1396, 10)

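The text mentions the IQR score next to the Z score but only demonstrates the latter. The sketch below shows both on a small list using only the standard library. It is an illustration under assumptions: the sample values (including the 1500) are hypothetical, the 1.5 × IQR fences are the conventional rule rather than something the text specifies, and the quartiles use the default method of `statistics.quantiles`.

```python
from statistics import mean, pstdev, quantiles

# Hypothetical values: an even series plus one value far away from the rest
values = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 1500]

# Z score: distance from the mean in units of standard deviation.
# pstdev is the population standard deviation, matching the default
# (ddof=0) of scipy.stats.zscore used in the listing above.
mu = mean(values)
sigma = pstdev(values)
z_scores = [abs(v - mu) / sigma for v in values]
z_outliers = [v for v, z in zip(values, z_scores) if z > 3]

# IQR score: anything further than 1.5 * IQR outside the middle 50% is flagged.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = [v for v in values if v < lower or v > upper]

print(z_outliers)
print(iqr_outliers)
```

The IQR rule relies on quartiles, so a single extreme value barely moves its fences, whereas the same value inflates the mean and standard deviation that the Z score depends on.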

14. Exploratory Data Analysis


Examining in detail how our data is laid out is what Exploratory Data Analysis means.

14.1 Univariate


Taking the data of a single column and examining it alone is called univariate analysis. Examining how the values in two columns relate to each other is called bivariate analysis. Looking at how several columns together influence one target column is called multi-variate analysis.

Histogram, density plot and box plot are the plot types that help most with univariate analysis.

A histogram splits the values of one variable into several bins and shows how many values fall into each bin. In the example below, the 'GrLivArea' column holds the living area, in square feet, of a number of houses. The values are split into bins such as 500, 1000, 1500 ... 3000, and the plot shows how many houses fall into each bin.

matplotlib and seaborn both provide plots of this kind. The histogram here is the plot provided by matplotlib, while the density plot is the one provided by seaborn.

A boxplot is another plot type for analysing a single variable. It shows a box-like shape, and the line in the middle of the box is the median. The lines above and below the box show how far the data is spread. The few isolated points that appear here and there beyond the ends of those lines are the outliers.

https://gist.github.com/nithyadurai87/5be067164741348c6a51d6af6d8d78b7


import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("14_input_data.csv")
df = df.fillna(0)
df = df[:100]

y = [i for i in range(0,10)]

# Histogram (matplotlib)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title="Total Living Sq.Ft",
       ylabel='No of Houses', xlabel='Living Sq.Ft')
ax.hist(df['GrLivArea'])
plt.savefig('Histogram.jpg')

# Density plot (seaborn). distplot is deprecated in recent seaborn releases;
# sns.kdeplot(df['GrLivArea'], fill=True) is the current equivalent.
sns.distplot(df['GrLivArea'], hist = False, kde = True,
             kde_kws = {'shade': True, 'linewidth': 3})
plt.savefig('DensityPlot.jpg')

# Box plot (matplotlib)
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title="Total Living Sq.Ft",
       ylabel='No of Houses', xlabel='Living Sq.Ft')
ax.boxplot(df['GrLivArea'])
plt.savefig('BoxPlot.jpg')
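What a histogram counts can be sketched by hand. The example below is illustrative only: the area values and the fixed bin width of 500 are assumptions chosen to mirror the 500, 1000, 1500 ... bins described above, and are not read from 14_input_data.csv.

```python
# Hypothetical living areas in square feet
areas = [480, 720, 950, 1100, 1250, 1490, 1500, 1730, 2100, 2650, 2999]
bin_width = 500

# Count how many houses fall into each bin, keyed by the bin's lower edge
counts = {}
for area in areas:
    start = (area // bin_width) * bin_width
    counts[start] = counts.get(start, 0) + 1

for start in sorted(counts):
    print(f"{start}-{start + bin_width}: {counts[start]} houses")
```

Each bar of the histogram is one of these counts drawn as a height. `ax.hist` does the same bookkeeping but picks ten equal-width bins between the minimum and maximum by default.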

14.2 Bivariate


Drawing a plot to see how two variables relate to each other is bi-variate analysis. One variable is placed on the X-axis, the other on the Y-axis, and the plot is drawn.

How the sale price of each house changes with its area in square feet is shown here through a scatter plot and heatmaps. There are two heatmaps: one is the plot provided by Seaborn and the other is the plot provided by matplotlib.


A scatter plot shows each data point as a separate dot at its position. Instead of dots, small circles or a few other shapes can also be used to mark the data.


A heatmap is a plot type for drawing 2 dimensional data. Here a plot of size 12*12 is drawn. Every individual value in the matrix is marked with its own colour. In general it helps to see how our data is laid out. Two kinds of heatmaps, provided by Seaborn and by matplotlib, are given here.

https://gist.github.com/nithyadurai87/d93a853d86cf5500011cb41308dd1935


import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("14_input_data.csv")
df = df.fillna(0)
df = df[:500]

# Scatter plot: price on the X-axis, area on the Y-axis
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)
ax.set(title='Living area vs Price of the house',
       xlabel='Price', ylabel='Area')
price = df['SalePrice'].tolist()
area = df['GrLivArea'].tolist()
ax.scatter(price,area)
plt.savefig('ScatterPlot.jpg')

# Heatmap (seaborn) of the two columns
df2 = pd.DataFrame()
df2['sale'] = df['SalePrice']
df2['area'] = df['GrLivArea']
fig = plt.figure(figsize=(12,12))
r = sns.heatmap(df2, cmap='BuPu')
plt.savefig('HeatMapSeaborn.jpg')

# Heatmap (matplotlib): 2D histogram of price against area
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)
# Labels corrected to match the plotted axes (price on X, area on Y)
ax.set(title='Living area vs Price of the house',
       xlabel='Price', ylabel='Area')
ax.hist2d(price,area,bins=100)
plt.savefig('HeatMapMatplotlib.jpg')

[Figure: Scatter Plot]
[Figure: HeatMap - Seaborn]
[Figure: HeatMap - Matplotlib]

14.3 Multivariate


Finding out how one target variable behaves with respect to more than two values is multi-variate analysis. Parallel coordinates is a plot type for drawing multi dimensional data.


Plots of this kind are drawn here through plotly and matplotlib. Treating 'SalePrice' as the categorical variable, the plot shows how its features are spread. From it we can find out whether any trend is present. When drawing with Plotly, notice that the min and max values of each column are given as its range. This plot is saved as an html file in interactive form.


https://gist.github.com/nithyadurai87/2b0bb469694d33c7d1472880f10f67e1


import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates
import plotly
import plotly.graph_objs as go
import numpy as np

df = pd.read_csv("14_input_data.csv")

# Parallel coordinates through pandas/matplotlib, grouped by SalePrice
parallel_coordinates(df, 'SalePrice')
plt.savefig('ParallelCoordinates.jpg')

# Summary statistics (min, max, ...) used to choose the ranges below
desc_data = df.describe()
desc_data.to_csv('./metrics.csv')

X = df[list(df.columns)[:-1]]
y = df['SalePrice']

# Parallel coordinates through plotly, one axis per column
data = [
    go.Parcoords(
        line = dict(colorscale = 'Jet',
                    showscale = True,
                    reversescale = True,
                    cmin = -4000,
                    cmax = -100),
        dimensions = list([
            dict(range = [1,10],
                 label = 'OverallQual', values = df['OverallQual']),
            dict(range = [0,6110],
                 label = 'TotalBsmtSF', values = df['TotalBsmtSF']),
            dict(tickvals = [334,4692],
                 label = '1stFlrSF', values = df['1stFlrSF']),
            dict(range = [334,5642],
                 label = 'GrLivArea', values = df['GrLivArea']),
            dict(range = [0,3],
                 label = 'FullBath', values = df['FullBath']),
            dict(range = [2,14],
                 label = 'TotRmsAbvGrd', values = df['TotRmsAbvGrd']),
            dict(range = [0,3],
                 label = 'Fireplaces', values = df['Fireplaces']),
            dict(range = [0,4],
                 label = 'GarageCars', values = df['GarageCars']),
            dict(range = [0,1418],
                 label = 'GarageArea', values = df['GarageArea']),
            dict(range = [34900,555000],
                 label = 'SalePrice', values = df['SalePrice'])
        ])
    )
]

# Writes an interactive html file and opens it in the browser
plotly.offline.plot(data, filename = './parallel_coordinates_plot.html',
                    auto_open= True)

15. Polynomial Regression


Polynomial regression can be used for data that does not fit a straight line and instead follows a somewhat curved pattern. The program below is given the square feet of a house and its price. A linear fit and then 2nd order, 3rd order, 4th order & 5th order polynomial fits are tried on it.


https://gist.github.com/nithyadurai87/b7d3bf7733b5d4a8d2c8b2d1b8dcb531


import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

X = pd.DataFrame([100,200,300,400,500,600],columns=['sqft'])
y = pd.DataFrame([543543,34543543,35435345,34534,34534534,345345],columns=['Price'])

# Plain linear fit
lin = LinearRegression()
lin.fit(X, y)

plt.scatter(X, y, color = 'blue')
plt.plot(X, lin.predict(X), color = 'red')
plt.title('Linear Regression')
plt.xlabel('sqft')
plt.ylabel('Price')
plt.show()

# Polynomial fits of order 2 to 5
for i in [2,3,4,5]:
    # Expand sqft into its powers up to degree i
    poly = PolynomialFeatures(degree = i)
    X_poly = poly.fit_transform(X)

    # Fit an ordinary linear regression on the expanded features
    lin2 = LinearRegression()
    lin2.fit(X_poly, y)

    plt.scatter(X, y, color = 'blue')
    plt.plot(X, lin2.predict(X_poly), color = 'red')
    plt.title('Polynomial Regression')
    plt.xlabel('sqft')
    plt.ylabel('Price')
    plt.show()


When we fit with linear regression, the line does not sit on any of the data points. This is called under fitting.

With the 2nd Order, the squares of the values are computed and the fit is attempted on them. The result is called a non-linear function, that is, it does not form a straight line.

In the same way, with the 3rd Order the cubes of the values are computed, and we can see the curve moving a little closer to the data.

Finally, with the 4th Order the non-linear curve fits every data point completely. This is called over fitting. Over fitting of this kind is not correct either.

So we can take for prediction the order at which our non-linear curve fits the data broadly, without over fitting. Because successive powers of a number are computed in this method, the equation is built from those powers. As the numbers grow this quickly, feature Scaling becomes important here.
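How quickly the powers grow, and what scaling does about it, can be sketched in plain Python. This is an illustration under assumptions: the sqft values are the ones from the program above, the degree of 4 is picked only for the example, and dividing each column by its largest value is just one simple scaling choice, not the method the text prescribes.

```python
sqft = [100, 200, 300, 400, 500, 600]
degree = 4

# One row per house: [x, x^2, x^3, x^4]
features = [[x ** p for p in range(1, degree + 1)] for x in sqft]

# Largest value in each column shows how far apart the magnitudes drift
col_max = [max(row[j] for row in features) for j in range(degree)]

# Simple max scaling: every column ends up in the range (0, 1]
scaled = [[row[j] / col_max[j] for j in range(degree)] for row in features]

print(col_max)
print(scaled[0])
print(scaled[-1])
```

Before scaling the fourth-power column is hundreds of millions of times larger than the first, which makes gradient based fitting slow or unstable. After scaling all columns share the same range.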



15.1 Underfitting - High bias


The state in which the prediction line does not fit the data well is called underfitting. It arises when a large amount of data is predicted with only a few features. It is also called the high bias problem, because the model works from only a very small number of attributes. For example, when 50,000 rows of data are predicted with just two features, the data will not all sit on the line. Increasing the amount of data is therefore no solution to a problem like this; only the number of features has to be increased.

15.2 Overfitting - High variance


We have already seen that underfitting can be avoided by adding more features. If too many are added, however, the state called overfitting sets in. The regularization parameter is what is added to avoid this. Overfitting arises when the number of data rows is small and the number of features is large. For example, when just 50 rows are predicted with 250 features, the line fits every point far too closely. This is called high variance.

If the number of features needed for prediction is cut down too far, high bias appears instead. This is called the bias-variance tradeoff.

To avoid problems like these, either reduce the number of features to the right level or use regularization.
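The two extremes can be sketched with NumPy's `polyfit`. Everything here is an assumption made for illustration: the noisy straight-line data, the random seed, and the choice of degrees 0, 1 and 5; none of it comes from the text's datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a straight line (slope 2) with a little noise
x_train = np.linspace(0, 1, 6)
y_train = 2 * x_train + rng.normal(0, 0.1, size=6)
x_test = np.linspace(0.05, 0.95, 6)
y_test = 2 * x_test

def mse(coeffs, x, y):
    """Mean squared error of the polynomial given by coeffs."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

train_error = {}
test_error = {}
for deg in (0, 1, 5):
    coeffs = np.polyfit(x_train, y_train, deg)
    train_error[deg] = mse(coeffs, x_train, y_train)
    test_error[deg] = mse(coeffs, x_test, y_test)
    print(deg, train_error[deg], test_error[deg])
```

Degree 0 (a constant) is too simple and misses the trend on both sets, which is the high bias case. Degree 5 passes through all six training points, so its training error is essentially zero, yet it has fitted the noise as well; comparing the training and test columns in the printed output is how that high variance case is spotted.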

15.3 Regularization


This reduces the size of the parameters (the thetas) attached to each feature. The number of features can therefore stay high, while each of them is made to take only a small part in the prediction. When it is combined with linear regression, the equation becomes the following.


Linear regression:

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

Here lambda is the regularization parameter. Notice that it is applied only to the parameters of features 1 to n (j = 1 to n). We have already seen that the value of x0 is always 1, so there is no need to reduce the value of theta0.

The value of lambda should not be too large, nor should it be too small. If it is too small it will not prevent overfitting. If it is too large it becomes a cause of bias. So it has to be set correctly.

When regularization is combined with gradient descent, the regularization term is again attached from theta1 onward and not to theta0.

When regularization is combined with the normal equation, that equation gains a corresponding lambda term.


Normal Equation:

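The two regularized forms described above can be sketched in NumPy. This is not taken from the source, whose equation images did not survive; it is the standard textbook form, assumed here to be what was intended: the gradient adds (lambda/m)·theta_j for j ≥ 1, and the normal equation becomes theta = (XᵀX + lambda·L)⁻¹ Xᵀy, where L is the identity matrix with its top-left entry set to 0 so that theta0 is left alone. The data, `lam` and `alpha` are hypothetical.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """J = (1/2m) * [sum of squared errors + lam * sum(theta_j^2 for j >= 1)]."""
    m = len(y)
    residual = X @ theta - y
    return (residual @ residual + lam * np.sum(theta[1:] ** 2)) / (2 * m)

def gradient_step(theta, X, y, lam, alpha):
    """One gradient descent update; theta0 is not regularized."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m
    grad[1:] += (lam / m) * theta[1:]
    return theta - alpha * grad

def normal_equation(X, y, lam):
    """Closed-form solution with the same penalty (theta0 excluded)."""
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

# Hypothetical data; the first column of X is x0 = 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])
lam = 1.0

theta_closed = normal_equation(X, y, lam)

theta_gd = np.zeros(2)
for _ in range(20000):
    theta_gd = gradient_step(theta_gd, X, y, lam, alpha=0.05)

print(theta_closed)
print(theta_gd)
```

Both routes minimise the same cost, so they should agree; gradient descent simply gets there iteratively. Raising `lam` pulls theta1 toward zero while leaving theta0 free to absorb the offset.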

16. Logistic regression


When our prediction does not produce a whole numeric value but instead falls under one of a set of categories, it is called logistic regression. This categorization happens in two ways, binary and multiclass. Logistic regression is an algorithm that helps with this. Only its name carries the word regression; it is in fact an algorithm for classification.


$$
\begin{aligned}
h(x) &= 0 \text{ or } 1 \\
0 &< h(x) < 1 \\
h(x) &= g(z) \\
     &= g(\theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n) \\
     &= g(\theta^T x) \\
     &= \frac{1}{1 + e^{-\theta^T x}}
\end{aligned}
$$

16.1 Sigmoid function


Will something happen or not? Is it there or not? That is what this predicts. Yes is predicted as 1 and no as 0, so the prediction lies between 0 and 1. For g(z), which is predicted from the value of z, to lie between 0 and 1, its formula has to be 1/(1 + e**-z). This is called the sigmoid function.


So if theta-transpose x is substituted in the place of z, we get the equation that keeps h(x) between 0 and 1: h(x) = 1/(1 + e**-(theta-transpose x)). This is the equation for logistic regression.
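The formula above can be tried out directly. The sketch below is illustrative: the function names `sigmoid` and `hypothesis` and the sample theta and x values are made up for the example.

```python
import math

def sigmoid(z):
    """g(z) = 1 / (1 + e^-z); always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h(x) = g(theta-transpose x), where x[0] is the constant 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# Large negative z gives values near 0, large positive z gives values near 1
for z in (-6, -1, 0, 1, 6):
    print(z, round(sigmoid(z), 4))

# Hypothetical parameters and one input row (x0 = 1)
print(hypothesis([-3, 1, 1], [1, 2, 2]))
```

z = 0 maps to exactly 0.5, the midpoint of the curve, which is why 0.5 is the natural cut-off between predicting 0 and predicting 1.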


The program to predict whether an email is spam or not is as follows.


https://gist.github.com/nithyadurai87/f09984303f976ca6eb8a64a4b7f0e391


import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
# LogisticRegression is imported from sklearn.linear_model;
# the old sklearn.linear_model.logistic path no longer exists in current releases
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

# Column 0 holds the label, column 1 holds the message text
df = pd.read_csv('./spam.csv', delimiter=',',header=None)

X_train_raw, X_test_raw, y_train, y_test = train_test_split(df[1],df[0])

# Turn the raw text into TF-IDF feature vectors
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_raw)
X_test = vectorizer.transform(X_test_raw)

classifier = LogisticRegression()
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)
print(predictions)


['ham' 'ham' 'ham'] 

16.2 Decision Boundary


h(x) always stands for 'yes' (how likely the answer is 1), so 1-h(x) stands for 'no'. For example, if h(x) predicts a 70% chance that it will rain tomorrow, it means the remaining 30% backs 'no rain'.

If the data is spread out as in the plot below, the decision boundary is what settles above which point we predict yes and below which point we predict no. It always depends on the theta values. If the values -3, 1, 1 are substituted for theta0, theta1, theta2, the decision boundary is set so that x1 and x2 together (x1 + x2) must come to 3 or more for h(x) = 1 to be predicted.

If the data is spread in a non-linear fashion, the theta values, beginning with -1, are substituted into a 2nd Order polynomial equation, and 1 is found as the boundary. This is also called a threshold classifier.

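The linear example above can be checked numerically. This is a sketch under assumptions: the function `predict` and the test points are hypothetical, and the cut-off h(x) ≥ 0.5 for predicting 1 is the usual convention rather than something the text states.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

theta = [-3, 1, 1]  # theta0, theta1, theta2 from the example above

def predict(x1, x2):
    """Return 1 when h(x) >= 0.5, which happens exactly when x1 + x2 >= 3."""
    z = theta[0] * 1 + theta[1] * x1 + theta[2] * x2
    return 1 if sigmoid(z) >= 0.5 else 0

for point in [(1, 1), (1.5, 1.5), (2, 2), (0, 4)]:
    print(point, predict(*point))
```

Because the sigmoid crosses 0.5 exactly at z = 0, the boundary is the straight line -3 + x1 + x2 = 0. Points on or above that line are predicted as 1 and points below it as 0.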

16.3 Cost function


If it really is going to rain tomorrow and the prediction says it will not, that is an error. Likewise, if it is not going to rain and the prediction says it will, that is an error too. In other words, when 1 is predicted as 0, or 0 is predicted as 1, the size of the mistake cannot be stated as a bounded percentage; its value heads to infinity. In the plots for this, with h(x) on the x-axis, the formula for the curve that heads toward infinity on the y-axis is -log( h(x) ).

That is,

when 1 is predicted as 0, cost = -log( h(x) ); likewise,

when 0 is predicted as 1, cost = -log( 1-h(x) ).

So the formula for the cost becomes the following. Substitute y=1 and y=0 in it to check.

$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m} y^{(i)}\log h(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h(x^{(i)})\right)\right]$$


When y = 1,

cost = -[y.log(h(x)) + (1-y).log(1-h(x))]
     = -[1.log(h(x)) + (1-1).log(1-h(x))]
     = -[log(h(x)) + 0]
     = -log(h(x))

When y = 0,

cost = -[y.log(h(x)) + (1-y).log(1-h(x))]
     = -[0.log(h(x)) + (1-0).log(1-h(x))]
     = -[0 + 1.log(1-h(x))]
     = -log(1-h(x))
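The cost formula can be evaluated on a few predictions to see how sharply it punishes confident mistakes. The helper names `cost_single` and `cost`, and the sample probabilities, are assumptions made for this sketch.

```python
import math

def cost_single(h, y):
    """Cost of one prediction h (strictly between 0 and 1) for true label y (0 or 1)."""
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))

def cost(hs, ys):
    """Average cost J over m examples."""
    m = len(ys)
    return sum(cost_single(h, y) for h, y in zip(hs, ys)) / m

# True label is 1: the closer h is to 0, the larger the cost
for h in (0.9, 0.5, 0.1, 0.01):
    print(h, round(cost_single(h, 1), 4))

# Hypothetical batch of three predictions
print(round(cost([0.9, 0.2, 0.7], [1, 0, 1]), 4))
```

As h approaches the wrong extreme, the cost grows without bound, which is the "infinity" the text refers to. A prediction of exactly 0 or 1 would make `math.log` fail, so h is assumed to stay strictly inside that range.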

If the squared-error cost of linear regression were reused with the sigmoid hypothesis, the contour plots would not form a single bowl shape; they would pick up small bends and have many rises and dips. That is called a non-convex function. In other words, the plot for regression has one and only one global optimum, whereas the plot for classification has several local optimums, because only the two values 'yes' and 'no' are predicted in alternation.

The log-based cost above avoids this, so gradient descent can be used on it.

The equation for gradient descent is the same one used in multiple linear regression. The only difference is that h(x), which there was theta-transpose x, here wraps it in the Sigmoid function.
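Gradient descent for logistic regression can be sketched in plain Python. All specifics here are assumptions for illustration: the hours-studied data, `alpha`, and the iteration count are made up; only the update rule (the multiple linear regression update with h(x) replaced by the sigmoid of theta-transpose x) follows the description above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, row):
    return sigmoid(sum(t * xi for t, xi in zip(theta, row)))

def cost(theta, X, y):
    """Log cost J averaged over all rows."""
    m = len(y)
    total = 0.0
    for row, yi in zip(X, y):
        h = hypothesis(theta, row)
        total += -(yi * math.log(h) + (1 - yi) * math.log(1 - h))
    return total / m

def gradient_descent(X, y, alpha=0.1, iterations=5000):
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        predictions = [hypothesis(theta, row) for row in X]
        # Same form as linear regression: (1/m) * sum((h - y) * x_j)
        gradient = [
            sum((p - yi) * row[j] for p, yi, row in zip(predictions, y, X)) / m
            for j in range(n)
        ]
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
    return theta

# Hypothetical data: x0 = 1, x1 = hours studied; label 1 = passed
X = [[1, 0.5], [1, 1.0], [1, 1.5], [1, 2.0], [1, 2.5], [1, 3.0], [1, 3.5], [1, 4.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

start_cost = cost([0.0, 0.0], X, y)
theta = gradient_descent(X, y)
print(theta)
print(start_cost, cost(theta, X, y))
```

With all thetas at zero every prediction is 0.5 and the cost is log 2. Each update then lowers the cost, moving the predictions for small x1 toward 0 and for large x1 toward 1.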


16.4 Classification accuracy


Predicting 'no' when it really is going to rain tomorrow, and predicting 'yes' when it is not, are the mistakes that happen in classification. Accuracy is the measure of how much of the data was predicted correctly.

Suppose the weather predictions for ten days are as shown in the example below. y_true holds the record of whether it actually rained or not, as 1 and 0. The predictions for those days are in y_pred. Comparing the two, notice that only the second, sixth and seventh predictions differ. So out of a total of 10 data points only 3 are wrong, and the accuracy comes out as 70%.

https://gist.github.com/nithyadurai87/7668ce262ed9070d89b158bb7f13c5cb


from sklearn.metrics import precision_recall_fscore_support
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt

# Actual outcomes and the model's predictions for ten days
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 0, 0, 0, 1, 1, 1]

print ('Accuracy:', accuracy_score(y_true, y_pred))
print (confusion_matrix(y_true, y_pred))
print (precision_recall_fscore_support(y_true, y_pred))

# Draw the confusion matrix as a coloured grid
plt.matshow(confusion_matrix(y_true, y_pred))
plt.title('Confusion matrix')
plt.colorbar()
plt.ylabel('True label')
plt.xlabel('Predicted label')
plt.show()


Accuracy: 0.7
[[4 1]
 [2 3]]
(array([0.66666667, 0.75]), array([0.8, 0.6]), array([0.72727273, 0.66666667]), array([5, 5], dtype=int64))


16.5 Confusion Matrix 


It is built according to the following rules.

A value that is actually 0 but is predicted as 1 = False Positive

A value that is actually 1 but is predicted as 0 = False Negative

A value that is actually 1 and is also predicted as 1 = True Positive

A value that is actually 0 and is also predicted as 0 = True Negative
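The four rules above can be checked by hand. The following is a minimal sketch in plain Python, not the scikit-learn implementation; the helper name confusion_counts is hypothetical, and the y_true and y_pred lists are the ones from the accuracy example in section 16.4.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels, treating 1 as the positive class."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1          # true positive
        elif actual == 0 and predicted == 0:
            tn += 1          # true negative
        elif actual == 0 and predicted == 1:
            fp += 1          # false positive
        else:
            fn += 1          # false negative (actual 1, predicted 0)
    return tp, tn, fp, fn


y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 0, 0, 0, 1, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 4 1 2
```

These counts correspond to the matrix [[4 1] [2 3]] printed earlier: the first row is TN and FP, the second row is FN and TP.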
16.6 Precision, Recall & F1 score


Precision (P) 3resru^i 3T3,3,6S)6 $t &&,<333,Lb {ssurDira, ' ^ld' 3resrd s>sstsf]^^isbsn^i 

3T35TU <3S)^lLj LD, 


Recall (R) 3tsstu^i 3Td,&,6$)6$i <906i?<g/_D ^su/orrai '^sv&nsv' 3resrd assissfl^^jensn^j 

373STU65)3)IL/LD <S SmdQ (J)£)/D0/ . <56L//D/7<95 SiStfsfldgiUU L_L_ £g)«j©5) TJSm® LD ££)«_}/_/<95 (off) (Sim//_0 

spGij LD$ 5 luurrg; LDrrfD^jsuQ^ F SCOre ^(^ld. ^d,rr>g;rrm (&,&>§?!rjLC> 

iShsireuQ^LDrrrii. 

P = True Positive / (True Positive + False Positive) 

R = True Positive / (True Positive + False Negative) 

F score = 2PR / (P + R)
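As a worked instance of the three formulas, the sketch below plugs in the counts for class 1 of the earlier accuracy example (3 true positives, 1 false positive, 2 false negatives). The function name precision_recall_f is hypothetical, and the sketch assumes the denominators are non-zero.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F score) from raw counts.

    Assumes tp + fp > 0, tp + fn > 0 and precision + recall > 0.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


p, r, f = precision_recall_f(tp=3, fp=1, fn=2)
print(round(p, 4), round(r, 4), round(f, 4))  # 0.75 0.6 0.6667
```

These are the same values that precision_recall_fscore_support reported for the second class (0.75, 0.6, 0.66666667).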

^<53>su3iS3)snd &sm® iSlL^uu^foairrm (Lpddhu^^jsuLD 3 tsstsst 3tsstqj ^uQurr&j 
urrrrdaisvrrLti. <s£06i/0<950 s_/_ldl5)<sv) ^rrouid^ensn ■*/_!_ is^uSlssr 

^en<33>suu GlurrqL-lfDgU GrBrnudarTssr andL^iurr ^sv&nsviurr 3 tsst (Lpu^si^ 
Q^iuil/ld Q&rr&>s$)6$T6$)UJ 3r(^)^^]d QsifrehQsurTLb. ^3,/D3irrm LDrr^lfl^ gs/jey&srrlsv njjrirsrfilsv 
s£0<si/0<s0 lo/_!_(5)(3lo '^lo' <3T<3S)iLD (LpLqsij <95 rrsmuu®Lb. Qu(njLburrsirs3>LDUjrr<3ST 
fBpsygi&rrltsv 1 ^sv&sxsv' <s7gn; ld (Lpis^Qsu rfis$)fndd}riKd(&Lb. QurrssrrQj spGrj 

(LpLqsfijsDssid &rrrjrF>g, 5 ^snsy LDn^lrflji, &,(j6n&>s$)snd QanemissKSuGuj "skewed 
classes" 3T^TrD3S)Lpds>uu(^iS}^TjDSST. ^su/nemin ssisu^^j iSlfoarr&v^tslGV ^smsDLDiunssr 

g,L-isi-u5lG5T <3]<snG$)Gud &,s$rf\d(&)LbQurr&)i, '^ld' 3ts$tu& ) (D(& ) unsung,' ggjsv&nsv' 

3T<3STU<3D<S (3UJ Q U (T7j LD U fTSiSTSD LD UJ fT& QSUSlflLIU (/J)<90/ LD . £g) SUIT)3DJDd <95 &SST L_ $13U3)fD(&) 

2_<5<si/6i/(3<5 precision ldto^ild recall ^ 0 ld . 
16.7 Trading-off between Precision & Recall


<s£ 0 <si/ 0 < 55 )/_uj sudL^uSlsir ^snsy 5mm. -<*0 Qldsv ^q^rB^rrsb LifDQj GrBmudsirTm 
<oT65i threshold ^ssiLDd&uuiLQsksnsjns, ssisu^^jd Gl 3> nsYrQ su n ld . ^uGung] 
£g)/ 5<5 ^>/6)T6iy<g50 Qldsv ^ssirrsb ^rr^rrrjsm sud £g)0<950/_o 6^061/ fill—id Q&mrg] '£g) 0 / 
L-lfDQ] (5/5 mil<*<95 rrssr st&st^ 3)SU(Drr&,d 3^rfil sididi—rrsv, ^sufi G> 3, ski suuSlsvsvrrLDSv 

use susSl l£)( 3j d)&)ds$)33>ss)6fT Qm/oG^nsnsn (Ssusmi^ufilq^di^id (false positive — 
high precision). 


stsktCpsu rf)LCid(3) 2_rru$5lujrT3;d> GgfilrBgnsv iulLQGld '^id' srsmd s^ro Qsusm($)id 

<oT6$iu3,iD3>n3, threshold-^* 7mm -i@ Qldsv ^^l3i uu(d)^^jQ sun id. ^uGungi 
6mm ^snsidsv L-I/Diru Qn,nuj 3>idiq. ^q^d^id s^q^sufili—id G3 : sirrru 2_thi3,^fr,d(^, s^ssr^jid 
1 ggjsv&nsv 1 srssrd 3h±rrtfLb ^umuid QrBQ^id (false negative — high recall). 

£g)<56977761) ^617070 ^ 61) idJdkULD /7<95 £g)0/5#i/ 6l5)(J)617/7/7. 


^,<*(561/precision -god ^ssi/oda sfii^idiSissmsb, recall ^dtiafiidigid. Recall-^* 
069) i ®<®<95 6i5]0/_o/_S)697/76i) precision ^tdi&rfidt&fLb. ^)^yG6i/trading-off between 
precision & recall srsmuuQ&jDgi. 
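The trade-off can be seen by sweeping the threshold over a small set of scores. This is only an illustrative sketch: the scores and labels below are made-up toy data, not taken from the book, and precision_recall_at is a hypothetical helper.

```python
def precision_recall_at(threshold, scores, labels):
    """Predict 1 when score >= threshold, then return (precision, recall)."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical model scores and the true labels for eight cases.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25]
labels = [1, 1, 1, 0, 1, 0, 1, 0]

for threshold in (0.3, 0.5, 0.7):
    precision, recall = precision_recall_at(threshold, scores, labels)
    print(threshold, round(precision, 3), round(recall, 3))
```

On this toy data, raising the threshold from 0.3 to 0.7 moves precision from about 0.714 up to 1.0 while recall falls from 1.0 to 0.6. Real data need not be this regular, but the general tendency is the same.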
17. Multi-class classification


0 LD/DJJ}] LD 1 <o7W £g)0 /_$)/$ 61/<95 6)T LDL_(J)/_D ^6V6VrTLD6V, UsbQsurU iS)ifl61j&6YT iShsir, 

H^l^rrs, si/0/i) s^ssrrSlssxssr <oT[5d, iSlfilstflsir dLp <3 ]g$)LD< 95<s (Ssusm^iLD <oTsst gusttfluuGig, 

multi-claSS classification ^ 0 /i). £g)^)<si> iSlrflsi^^sh ^jq^ddlroQ^rr, 

< 3 j{£{f, 6 $) 65 i logistic asvsfJuLigi&rT r56&)i_Quiri]LC). i 3 <drssnj n§)\(f,na, suir^dlsirjD s^ssi^i, 

^9/gai ssrddiesrnsonii agiraflaa/j/JLffl . <oT^l<sv ^^laocirTaiu Qurrq^^ffjQroQ^rr, ^[B^u 
iSlrflsnsud Q^mros^i—U-iLCi. 


^LpdaismL- 2_3)rrrj6m{5§)l6b £)■*/_}/_/, mn^rr, udss)^, LDfSTj&m 67go//i) t¥,rrG$T(gj iSlrfhsygisfrlsv 
suss)smuthi3,sn ssnsnssr. 





(ip<g( 5 l 5 )< 5 i) ^)<S/j/_ 5 )«rxS!J 7 <g 5 ■S«f 5 fl/j/J< 5 i ®< 95 / 7 «J 7 hypothesis 2 _ 061 // 7 < 95 <S/j/_/(/J)Z_O. h(x) 

= 1 <oT6577_0/ §\3,ljl5l6$)6$T& (J^ffid^LCi. dl3iU Lj ^ 6V6V ITdj ^6ti)<o$Tg>gj] LD 0 -^jySV 
0 /? 5)<95 <95 LV/_/(/))/_D. 


<S>]®3)3)1 ^MI^rTSSKSud 3>6$sf]uUg,ff) <95 nssr hypothesis 2_0SI//7<®95/_}/J(5)/_D. ^)^)si)h(x) — 

1 OT657/_/0/ 2M^rTSS)Sud (^f^d(^L£i. 2esg,rT <S>]6V&VrTg, <J>/65)637<50//_O 0 -^,SV (&,rf)ld3,lju($)Lb. 


)sijsunfDrT3i ^>/i 5 )^< 5 ( 5 )< 5<5 rfffothi&isrjjdig) hypothesis s_ 0 si// 7 <s 95 /j/j( 5 )/_D. 



iShsirsmrj, l /££)< 5 / 7<95 < s £0 sussxsmuLb sutr^QroQ^ssflsb tdaijurra 

&6^d&uu@«ijg,ing;rTm &rT&>$5lujLt) 30%, mag,rrsurra asmfldauui^su^fDarrssT ^rr^^lujLD 
40%, uass)aujrra asvtifldauu®sug,inarTm arr^^lujLD 60% LD^asnrra 
ass^ldauu^su^roarrssT arr^^UuLb 50% srm suq^Qro^Q^ssflsb Qg, , sr^sir g : g^^lujLb 
^^laLDrra ^(r^ddl/oQ^rr, ^rB^u iSlrflsfilsir dip ^ssildilild. £g)0/(3<si/ multi-claSS 
classification ^^iL. 
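The selection step described above (one hypothesis per class, then choose the class whose probability is highest) can be sketched as follows. The class names class_a to class_d are placeholders and pick_class is a hypothetical helper; only the four percentages come from the text.

```python
def pick_class(class_probabilities):
    """Return the class whose one-vs-rest hypothesis gives the highest probability."""
    return max(class_probabilities, key=class_probabilities.get)


# Probabilities from the four per-class hypotheses (30%, 40%, 60%, 50%).
probabilities = {
    'class_a': 0.30,
    'class_b': 0.40,
    'class_c': 0.60,
    'class_d': 0.50,
}

print(pick_class(probabilities))  # class_c
```

Note that the four values do not have to sum to 1, because each one comes from a separate binary hypothesis.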

Decision tree, gaussian NB, KNN, SVC ^dhu&nsu iQu^sirro multi 

class 0/63>6w/_//fluyLD algorithmns <^0 id. <s^0 LD<svrf LDSv&Shuir, QrjrTgorrsurT, 

g,mj}6$)[jujrT forestry] ^rjL~c>n<ssf]uu^ri)3>nssi multi-claSS classification 
iSlssrsuQ^LDrT^j. £g)«n su usvGsurry algorithms ^Lpscid r§laLp^^uu(^iS}ssipssi. 
^smsuasiflsv ^{BlaLDnstr SCOre LD/orryLb precision&recall Qg>rrsmis$)g) rEmh 

G>g,rjsy Qg 1 iuiu svrrLb.. 

https://gist.github.com/nithyadurai87/aaded978eb7e545006ed6117c97b86b3


from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_recall_fscore_support
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

df = pd.read_csv('./flowers.csv')

# all columns except the last are features; 'Flower' is the label
X = df[list(df.columns)[:-1]]
y = df['Flower']

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 0)

# decision tree
tree = DecisionTreeClassifier(max_depth = 2).fit(X_train, y_train)
tree_predictions = tree.predict(X_test)
print(tree.score(X_test, y_test))
print(confusion_matrix(y_test, tree_predictions))
print(precision_recall_fscore_support(y_test, tree_predictions))

# support vector classifier
svc = SVC(kernel = 'linear', C = 1).fit(X_train, y_train)
svc_predictions = svc.predict(X_test)
print(svc.score(X_test, y_test))
print(confusion_matrix(y_test, svc_predictions))
print(precision_recall_fscore_support(y_test, svc_predictions))

# k-nearest neighbours
knn = KNeighborsClassifier(n_neighbors = 7).fit(X_train, y_train)
knn_predictions = knn.predict(X_test)
print(knn.score(X_test, y_test))
print(confusion_matrix(y_test, knn_predictions))
print(precision_recall_fscore_support(y_test, knn_predictions))

# Gaussian naive Bayes
gnb = GaussianNB().fit(X_train, y_train)
gnb_predictions = gnb.predict(X_test)
print(gnb.score(X_test, y_test))
print(confusion_matrix(y_test, gnb_predictions))
print(precision_recall_fscore_support(y_test, gnb_predictions))


Output:

0.8947368421052632
[[15  1  0]
 [ 3  6  0]
 [ 0  0 13]]
(array([0.83333333, 0.85714286, 1. ]), array([0.9375, 0.66666667, 1. ]), array([0.88235294, 0.75, 1. ]), array([16, 9, 13], dtype=int64))

0.9736842105263158
[[15  1  0]
 [ 0  9  0]
 [ 0  0 13]]
(array([1. , 0.9, 1. ]), array([0.9375, 1. , 1. ]), array([0.96774194, 0.94736842, 1. ]), array([16, 9, 13], dtype=int64))

0.9736842105263158
[[15  1  0]
 [ 0  9  0]
 [ 0  0 13]]
(array([1. , 0.9, 1. ]), array([0.9375, 1. , 1. ]), array([0.96774194, 0.94736842, 1. ]), array([16, 9, 13], dtype=int64))

1.0
[[16  0  0]
 [ 0  9  0]
 [ 0  0 13]]
(array([1., 1., 1.]), array([1., 1., 1.]), array([1., 1., 1.]), array([16, 9, 13], dtype=int64))


< j >/( 5 ); 5 < 5 < 5 / t <95 6urrisi-&,G$)&,ujrr6nij Lj&,rrrf}Gb s-snsn Giirrij{£G$)&)&,6$)6fT& Gl&rrsm®, ^rB^u Lj&rrrj 

<oTrF>g, GuemauSleir dip ^ssildil/lc) <oTssi aevtifldigLC) MllltinomialNB algorithm. 

iShsmsuQ^LDrrru. 


https://gist.github.com/nithyadurai87/3ce9dab55025fe1fd41b4da48d3fcbd8
import pandas as pd
from io import StringIO
import matplotlib.pyplot as plt
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import chi2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

# error_bad_lines was removed in pandas 2.0; use on_bad_lines='skip' there
df = pd.read_csv('./Consumer_Complaints.csv', sep=',', error_bad_lines=False,
                 index_col=False, dtype='unicode')
df = df[pd.notnull(df['Issue'])]

# number of complaints per product
fig = plt.figure(figsize=(8,6))
df.groupby('Product').Issue.count().plot.bar(ylim=0)
plt.show()

X_train, X_test, y_train, y_test = train_test_split(df['Issue'], df['Product'], random_state = 0)
c = CountVectorizer()
clf = MultinomialNB().fit(TfidfTransformer().fit_transform(c.fit_transform(X_train)), y_train)

print(clf.predict(c.transform(["This company refuses to provide me verification and validation of debt per my right under the FDCPA. I do not believe this debt is mine."])))

tfidf = TfidfVectorizer(sublinear_tf=True, min_df=5, norm='l2',
                        encoding='latin-1', ngram_range=(1, 2), stop_words='english')
features = tfidf.fit_transform(df.Issue).toarray()
print(features)

df['category_id'] = df['Product'].factorize()[0]
pro_cat = df[['Product', 'category_id']].drop_duplicates().sort_values('category_id')
print(pro_cat)

for i, j in sorted(dict(pro_cat.values).items()):
    indices = np.argsort(chi2(features, df.category_id == j)[0])
    print(indices)
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() there
    feature_names = np.array(tfidf.get_feature_names())[indices]
    unigrams = [i for i in feature_names if len(i.split(' ')) == 1]
    bigrams = [i for i in feature_names if len(i.split(' ')) == 2]
    print(">", i)
    # the join separator is not legible in the source; ',' is assumed
    print("unigrams:", ','.join(unigrams[:5]))
    print("bigrams:", ','.join(bigrams[:5]))


$}3)JD(3) (Lp&j®L))GV epsijQGUrTQTj pFOdllCt -GSI d(LpLD <oT&,3,6$)6$I /_/<S/7/7<SSY7' Uu51jnSd(^d 

QarTQd&LjuL-QsrTmGST gtsst <s^0 si/iso/tl/llo QpevL o Gus^irjrB^j urnjd&ijuQQfD&j. 



iSlGSTGSTlj <3]G$)6U 7 0"30 OTCT//-0 G)S]S^^^lGST U Lp UuSljDS Q<95/7(/J)<®<S/jz_//_!_(5) 
G&rrfBidgiLju® Sling)]. 


TfidfVectorizer ^Lpevib L]3yrrifl<sb ^sbsn S>]ss)ssi^^lib 

features -^&>Q&L£)d3>uu(i)\S<dirD6$i. /_5)«jjw/rChi2 opsoib spsijGhsurTQj) 3,<sis?}d,&,<sis?} 
category -(Siurripub Q^rrsmSsnsn sumj^ssi^sdissr ULbipiusb 

Q&L£\b 3 ,LjuQ)S(D§ f ]. iShsir&STrf ^/«nsi/ ^ssb^^ssf] surrij^Gsi^ujrrs, ^GmurB^rrsv <0775.5 
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category -sir dip ^ssiupiLjLb, <g£\pjGmippj6mi_rr3, ^ssiLD^^rrsv sripg, Category -sir 
dip ^ssildil/lI) (otsstu^i unigrams, bigrams srspnii Quiufisv 

Q&L£\&G,Lju($)QG$T(D6$T. . 


17.1 Vectors 


classification problem stssiuqi '^lL' ^svsv^j '^sv&nsv' srspiu) ld^IuiSIsst dip 

aismsfluiShs^m rffaiLp^^LD stsst sjjd&sktGsu giemGi—md. £ g ) s 5 ) si / <pp ssiroG uj 1 ^svsvgjO- 
^,sv (^/^danjutpud. rsmb d)sv&LDUJLb GurrdQiudj&ssisnGujrr, rffjprbui^rdaisiDsnGujrT, 
epsfiliurhjgi&nmGujrT 2 _ 6 rT 6 i?i_rTg;d QarrQfbg] uuSlpdl ^sdids, GsusmipuSlq^d^Lb. 
^^Gurrm/D ^i—rhi&srrlsv ^G\jjn<sis)ir)Quj6VGvrTLti l'S & 0'S -^<95 LDrrinrusu^irx^ 
skleam GULprhi(&)dls$rjD usvGsurgj sussia^iunssr Ghsudi—rjasrr urorflu-ii-b ^supp^ssr 
u uj sir urr (plash u/or^iLjid iSlshsuQpLDrrru arTsmsvmd. 


usbGsurp] surrdQiutdassisnij Quif)r^(rpd(^Lt) s^qp QgjrrqpuLi corpus srmuu(p)dlp^]. 
^][5&> COrpUS-Sl) 2-GYTGYT ^ S5)S5Td>S5)a>lLI LL> O's & 1' 's <^<95 LDIT/biriJ61Jg>JD(&) 

dictvectorizer() , countvectorizerQ ^Siiues)su uiushuCpidissTpsm. 


dipdasmi— 2 _^rrpjsm^^lsv Corpus 1 LD/o^jid COrpus2 srspiid G^rrsmCh) Corpus 
QarT(^idauuLL(^ishsnssi. (Lp^sSlsv ^shsrr^j dictvectorizerQ -i@ ^^rTrjsmLDnas^Lb, 

^jfjsstsTL _ nsu3>na ^sytstt^j COlintvectorizer() -<*0 ^^rrrjsmLDrTas^Lb 

^ssiLDrB^jsnsYT^j. ^(pi^^na vector otct/lq variable-si?. COrpus2-<a> zshsn 

SI//7<95®UJ/Ey<95(g5<95<S/7«J7 encode Q&UJUJUUlGL- Qsudl_[jash ^SSlLDrB^JSYT&rrSIfr. ^GU/bem/D 

s&Ksud)^] fpmb ^rjsm® Gi<sudi_rf^(^ddiss)i_Gujujnsifr euclidean distance-®*) 

srshsurr^j asmtpliSlLpuu^] stsstpq] umjdasvrrLb. 

https://gist.github.com/nithyadurai87/f3fff58ab7272279ef069689fc391dee
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.metrics.pairwise import euclidean_distances

corpus1 = [{'Gender': 'Male'},{'Gender': 'Female'},{'Gender': 'Transgender'},
           {'Gender': 'Male'},{'Gender': 'Female'}]

corpus2 = ['Bird is a Peacock Bird','Peacock dances very well',
           'It eats variety of seeds','Cumin seed was eaten by it once']

# the four sentences of corpus2, already encoded as count vectors
vectors = [[2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1],
           [0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0],
           [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0]]

# one-hot encoding
v1 = DictVectorizer()
print(v1.fit_transform(corpus1).toarray())
print(v1.vocabulary_)

# bag-of-words (term frequencies, binary frequencies)
v2 = CountVectorizer()
print(v2.fit_transform(corpus2).todense())
print(v2.vocabulary_)

print(TfidfVectorizer().fit_transform(corpus2).todense())

print(HashingVectorizer(n_features=6).transform(corpus2).todense())

print(euclidean_distances([vectors[0]],[vectors[1]]))
print(euclidean_distances([vectors[0]],[vectors[2]]))
print(euclidean_distances([vectors[0]],[vectors[3]]))


1. DictVectorizer() - helps convert a categorical variable into 1's & 0's. Here the categorical variable 'Gender' takes the values 'Male', 'Female' and 'Transgender'. First a dictionary is created from the unique values. Then, with each unique value as a column and each given record as a row, a matrix of dimension 5*3 is created. Each record becomes one row of that matrix; the row has 1 in the position of the dictionary entry that matches the record and 0 everywhere else. A vector is created in this way. This is called one-hot encoding.


print(v1.fit_transform(corpus1).toarray())
[[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [1. 0. 0.]]


fit_transform() takes our corpus as input and creates the vector for it. todense() produces the dense form of the vector for the words. vocabulary_ holds the dictionary that was used to create the vector.


print(v1.vocabulary_)
{'Gender=Male': 1, 'Gender=Female': 0, 'Gender=Transgender': 2}
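The two steps described for one-hot encoding (build a dictionary of unique values, then mark the matching column with 1) can be sketched without scikit-learn. This is a simplified illustration, not how DictVectorizer is implemented; the helper one_hot is hypothetical and handles a single key only.

```python
def one_hot(records, key):
    """Return (value -> column index, list of one-hot rows) for one categorical key."""
    values = sorted({record[key] for record in records})   # unique values, sorted
    index = {value: position for position, value in enumerate(values)}
    rows = []
    for record in records:
        row = [0] * len(values)
        row[index[record[key]]] = 1    # 1 only in the matching column
        rows.append(row)
    return index, rows


corpus1 = [{'Gender': 'Male'}, {'Gender': 'Female'}, {'Gender': 'Transgender'},
           {'Gender': 'Male'}, {'Gender': 'Female'}]

vocabulary, matrix = one_hot(corpus1, 'Gender')
print(vocabulary)  # {'Female': 0, 'Male': 1, 'Transgender': 2}
for row in matrix:
    print(row)
```

The resulting 5*3 matrix has the same rows as the DictVectorizer output shown above, with integers instead of floats.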


2. CountVectorizer() - converts all the given sentences into 1's & 0's. Our example has 4 sentences and 17 unique words, so a matrix of dimension 4*17 is created. In each row, a word that occurs in the sentence is marked with 1 and a word that does not occur is marked with 0. This is called bag of words.
Only the presence of the words in a sentence is recorded; the order in which they occur cannot be recovered. All letters in the words are converted to lower case and the text is then turned into tokens. Tokenization means splitting the text, whose words are separated by spaces, into tokens. Tokens are the words that occur in the corpus:

'Bird is a Peacock Bird','Peacock dances very well','It eats variety of seeds','Cumin seed was eaten by it once'

The counts can be expressed in two ways, as binary frequency or as term frequency. Binary shows only 1's & 0's. Term shows how many times each word occurs. Here Bird occurs twice in the first sentence, so it is shown as 2.

The vocabulary built for this holds 17 unique words (numbered 0 to 16). Bird, Peacock and it each occur twice, but each is stored only once. Likewise, the vocabulary is not case-sensitive, so it and It are treated as the same word. Also, a is a single-character word and is not included.
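The bag-of-words steps described above (lower-case, tokenize, build a sorted vocabulary, count) can be reproduced in plain Python. This is a simplified sketch, not scikit-learn's code; bag_of_words is a hypothetical helper, and the regular expression mirrors CountVectorizer's default token pattern, which keeps only tokens of two or more characters.

```python
import re


def bag_of_words(documents):
    """Return (sorted vocabulary, term-frequency matrix) for a list of sentences."""
    # Lower-case each sentence and keep tokens of two or more word characters.
    tokenised = [re.findall(r'\b\w\w+\b', doc.lower()) for doc in documents]
    vocabulary = sorted({token for doc in tokenised for token in doc})
    index = {token: position for position, token in enumerate(vocabulary)}
    matrix = []
    for doc in tokenised:
        row = [0] * len(vocabulary)
        for token in doc:
            row[index[token]] += 1     # term frequency, so 'bird' becomes 2
        matrix.append(row)
    return vocabulary, matrix


corpus2 = ['Bird is a Peacock Bird', 'Peacock dances very well',
           'It eats variety of seeds', 'Cumin seed was eaten by it once']

vocabulary, matrix = bag_of_words(corpus2)
print(len(vocabulary))  # 17
print(matrix[0])
```

The first row has 2 in the column for bird and 1 in the columns for is and peacock. The word a is dropped because it has a single character.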
3. TfidfVectorizer() - normalizes the vector created from term frequencies and expresses those frequencies as weights. Instead of showing a raw count such as 2, the value is normalized before it is shown; this is called L2 normalization.


print(TfidfVectorizer().fit_transform(corpus2).todense())

[[0.84352956 0.         0.         0.         0.         0.
  0.42176478 0.         0.         0.         0.3325242  0.
  0.         0.         0.         0.         0.        ]
 [0.         0.         0.         0.52547275 0.         0.
  0.         0.         0.         0.         0.41428875 0.
  0.         0.         0.52547275 0.         0.52547275]
 [0.         0.         0.         0.         0.         0.46516193
  0.         0.36673901 0.46516193 0.         0.         0.
  0.46516193 0.46516193 0.         0.         0.        ]
 [0.         0.38861429 0.38861429 0.         0.38861429 0.
  0.         0.30638797 0.         0.38861429 0.         0.38861429
  0.         0.         0.         0.38861429 0.        ]]
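To see where the weights in the first row come from, the sketch below recomputes them by hand. It assumes scikit-learn's default settings, namely the smoothed idf ln((1 + n) / (1 + df)) + 1 followed by L2 normalization; tfidf_row is a hypothetical helper, and the counts are read off the corpus (bird occurs twice in sentence 1 and in 1 sentence overall, peacock occurs in 2 sentences).

```python
import math


def tfidf_row(term_counts, document_frequency, n_documents):
    """Weight one sentence's term counts by smoothed idf, then L2-normalize."""
    weights = {
        term: count * (math.log((1 + n_documents) / (1 + document_frequency[term])) + 1)
        for term, count in term_counts.items()
    }
    norm = math.sqrt(sum(weight * weight for weight in weights.values()))
    return {term: weight / norm for term, weight in weights.items()}


row = tfidf_row(
    term_counts={'bird': 2, 'is': 1, 'peacock': 1},         # sentence 1
    document_frequency={'bird': 1, 'is': 1, 'peacock': 2},  # sentences containing each word
    n_documents=4,
)
for term, weight in row.items():
    print(term, round(weight, 8))
```

The results agree with the first row of the output above: about 0.8435 for bird, 0.4218 for is and 0.3325 for peacock. peacock gets the lowest weight because it also appears in another sentence.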


4. HashingVectorizer() - creates the vector directly, without using a dictionary. The dict and count vectorizers seen above work in 2 steps: first they create the dictionary needed for building the vector, and only then do they create the vector. Skipping the first step and creating the vector directly is called the Hashing Trick. As the dictionary grows, the memory needed to store it grows as well. This kind of vectorizer was introduced to avoid that.


print(HashingVectorizer(n_features=6).transform(corpus2).todense())

[[ 0.         -0.70710678 -0.70710678  0.          0.          0.        ]
 [ 0.          0.         -0.81649658 -0.40824829  0.40824829  0.        ]
 [ 0.75592895  0.         -0.37796447  0.         -0.37796447 -0.37796447]
 [ 0.25819889  0.77459667  0.         -0.51639778  0.          0.25819889]]
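The idea behind the hashing trick can be sketched in a few lines: each token is hashed straight to a column number, so no dictionary is ever stored. This is only an illustration under stated assumptions. It uses zlib.crc32 as a stand-in hash, whereas scikit-learn uses a signed MurmurHash3 and then normalizes the rows, so the numbers differ from the output above; hashed_counts is a hypothetical helper.

```python
import re
import zlib


def hashed_counts(document, n_features):
    """Map each token to a column via a hash and count occurrences per column."""
    row = [0] * n_features
    for token in re.findall(r'\b\w\w+\b', document.lower()):
        column = zlib.crc32(token.encode('utf-8')) % n_features  # no vocabulary lookup
        row[column] += 1
    return row


row = hashed_counts('Bird is a Peacock Bird', n_features=6)
print(row)
```

With only 6 columns, different words can land in the same column (a collision). That is the price paid for not keeping a dictionary in memory.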


5. euclidean_distances - helps find how large the difference is between encoded sentences. In the example above, the difference between the first two sentences is the smallest, the difference between the first and the 3rd sentence is a little larger, and the difference between the first and the 4th sentence is larger still.


print(euclidean_distances([vectors[0]],[vectors[1]]))
[[2.82842712]]

print(euclidean_distances([vectors[0]],[vectors[2]]))
[[3.31662479]]

print(euclidean_distances([vectors[0]],[vectors[3]]))
[[3.60555128]]
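The distances above are the square root of the sum of squared differences between two vectors. A minimal plain-Python check, with euclidean_distance as a hypothetical helper name:

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared differences between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


vectors = [[2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1],
           [0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0],
           [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0]]

for other in vectors[1:]:
    print(round(euclidean_distance(vectors[0], other), 8))
```

For the first pair the squared differences add up to 8 (4 for bird, then 1 each for is, dances, very and well), and the square root of 8 is about 2.828. The second sentence is closest to the first because the two share the word peacock.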
17.2 Natural Language Toolkit


If we look closely at the vectors created so far, each row contains only one or two of the words, and holds 0's for all the other words. This makes the vector larger. A vector that mostly consists of such 0's is called a sparse vector. For example, suppose a corpus contains sentences from many fields such as politics, cinema and sports. When all of them are converted into one vector, the rows for politics will not contain the words used for cinema, and the rows for cinema will not contain the words used for sports. So every row is filled with many unnecessary 0's. This causes 2 main problems.

The first is that a lot of memory & space is wasted. The NumPy ecosystem provides special data types (such as SciPy's sparse matrices) that store only the non-zero values.

The second is that as the dimensionality grows, the amount of data needed to train on it grows as well. Otherwise there is a risk of overfit. This is called the 'curse of dimensionality' or the 'Hughes effect'. Reducing it is called dimensionality reduction.

If stop_words='english' is given when the vector is created, common words such as is, was, are are removed, and the dictionary is created only for the remaining words. This reduces the dimensionality.

In the same way there is a tool called NLTK. Using its stemmer and lemmatizer, the dimensionality of the vector can be reduced even further.
https://gist.github.com/nithyadurai87/491e5e6f9c009ebd88912e71ef9363a4


"""
import nltk
nltk.download()
"""
from sklearn.feature_extraction.text import CountVectorizer
from nltk import word_tokenize
from nltk.stem import PorterStemmer
from nltk.stem.wordnet import WordNetLemmatizer
from nltk import pos_tag

def lemmatize(token, tag):
    # lemmatize only nouns and verbs; return every other token unchanged
    if tag[0].lower() in ['n', 'v']:
        return WordNetLemmatizer().lemmatize(token, tag[0].lower())
    return token

corpus = ['Bird is a Peacock Bird','Peacock dances very well',
          'It eats variety of seeds','Cumin seed was eaten by it once']

print(CountVectorizer().fit_transform(corpus).todense())
print(CountVectorizer(stop_words='english').fit_transform(corpus).todense())

print(PorterStemmer().stem('seeds'))
print(WordNetLemmatizer().lemmatize('gathering', 'v'))
print(WordNetLemmatizer().lemmatize('gathering', 'n'))

# stem every token of every sentence
s_lines=[]
for document in corpus:
    s_words=[]
    for token in word_tokenize(document):
        s_words.append(PorterStemmer().stem(token))
    s_lines.append(s_words)
print('Stemmed:',s_lines)

# tag each token with its part of speech, then lemmatize it
tagged_corpus=[]
for document in corpus:
    tagged_corpus.append(pos_tag(word_tokenize(document)))

l_lines=[]
for document in tagged_corpus:
    l_words=[]
    for token, tag in document:
        l_words.append(lemmatize(token, tag))
    l_lines.append(l_words)
print('Lemmatized:',l_lines)


It can be set up for use as follows.


import nltk 
nltk.download() 
[Screenshot: NLTK Downloader window, Collections tab]

Identifier    Name                                                   Size   Status
all           All packages                                           n/a    partial
all-corpora   All the corpora                                        n/a    partial
all-nltk      All packages available on nltk_data gh-pages branch    n/a    partial
book          Everything used in the NLTK Book                       n/a    partial
popular       Popular packages                                       n/a    partial
tests         Packages for running tests                             n/a    partial
third-party   Third-party data packages                              n/a    not installed

Server Index: https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/j
Download Directory: /home/shrini/nltk_data


'Bird is a Peacock Bird','Peacock dances very well','It eats variety of seeds','Cumin seed was eaten by it once'

1. For the sentences above, CountVectorizer() creates the following vector (4*17).


print(CountVectorizer().fit_transform(corpus).todense())
[[2 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0]
 [0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 1]
 [0 0 0 0 0 1 0 1 1 0 0 0 1 1 0 0 0]
 [0 1 1 0 1 0 0 1 0 1 0 1 0 0 0 1 0]]


When the vector for the sentences above is created with stop_words='english', the words is, very, well, it, of, was, by, once are removed, so the dimensionality is reduced (4*9).

print(CountVectorizer(stop_words='english').fit_transform(corpus).todense())
[[2 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 1 0 0 0]
 [0 0 0 0 1 0 0 1 1]
 [0 1 0 1 0 0 1 0 0]]


2. Even with stop_words='english', seeds and seed are still stored as two separate words. PorterStemmer() was introduced to avoid this. It stores only the root of an English word and drops the other forms derived from it.

print(PorterStemmer().stem('seeds'))
seed
3. WordNetLemmatizer() stores an English word after taking its meaning into account. That is, if the same word is used as a noun in one place and as a verb in another, the two uses are stored as two separate words. For example, in 'I am gathering foods for birds' and 'seeds are stored in the gathering place', the word gathering is stored as gather for the verb and as gathering for the noun.

print(WordNetLemmatizer().lemmatize('gathering', 'v'))
gather

print(WordNetLemmatizer().lemmatize('gathering', 'n'))
gathering


4. When our corpus is passed through NLTK, the output is as follows.

print('Stemmed:',s_lines)
Stemmed: [['bird', 'is', 'a', 'peacock', 'bird'], ['peacock', 'danc', 'veri', 'well'], ['It', 'eat', 'varieti', 'of', 'seed'], ['cumin', 'seed', 'wa', 'eaten', 'by', 'it', 'onc']]

print('Lemmatized:',l_lines)
Lemmatized: [['Bird', 'be', 'a', 'Peacock', 'Bird'], ['Peacock', 'dance', 'very', 'well'], ['It', 'eat', 'variety', 'of', 'seed'], ['Cumin', 'seed', 'be', 'eat', 'by', 'it', 'once']]
18. Decision Trees & Random Forest


Decision trees and random forest are models that can perform both Regression and Classification on non-linear data, that is, data that cannot be separated by a single straight line. Decision trees take the values found in the sample data and use them to split the data into smaller and smaller parts. In the example below, DecisionTreeClassifier() and RandomForestClassifier() are used to decide whether a flower is a jasmine, a rose or a lotus. Four features, the length and width of each flower's sepal and the length and width of its petal, determine which flower it is. Splitting the data into parts on the basis of these features is the job of DecisionTreeClassifier().

Splitting the data in this way happens according to a set of conditions. For this reason these models are called eager learners. In contrast, KNN is a lazy learner. Random forest follows the approach called ensemble learning; ensemble means a group. That is, it creates many decision trees and combines them into a group. Each tree in the group is trained on a different portion of the data, so the accuracy of the group is higher. The example below shows this: Decision Trees reports 89% accuracy and Random forest reports 97% accuracy.

The way each of them splits the data is also shown as a diagram.

https://gist.github.com/nithyadurai87/d21ffb25b7f5a38d90a437e9f169d58e
from sklearn.datasets import load_iris
import pandas as pd
import os
from sklearn.tree import DecisionTreeClassifier,export_graphviz
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
from io import StringIO
import pydotplus
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from IPython.display import Image
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('./flowers.csv')

X = df[list(df.columns)[:-1]]
y = df['Flower']

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 0)

# decision tree; criterion may also be "gini"
a = DecisionTreeClassifier(criterion = "entropy", random_state = 100,
                           max_depth=3, min_samples_leaf=5)
a.fit(X_train, y_train)
y_pred = a.predict(X_test)

print("Confusion Matrix: ", confusion_matrix(y_test, y_pred))
print("Accuracy : ", accuracy_score(y_test,y_pred)*100)
print("Report : ", classification_report(y_test, y_pred))

# draw the decision tree and save it as an image
dot_data = StringIO()
export_graphviz(a, out_file=dot_data,filled=True,
                rounded=True,special_characters=True)
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
Image(graph.create_png())
graph.write_png("decisiontree.png")

# random forest
b = RandomForestClassifier(max_depth = None, n_estimators=100)
b.fit(X_train,y_train)
y_pred = b.predict(X_test)

print("Confusion Matrix: ", confusion_matrix(y_test, y_pred))
print("Accuracy : ", accuracy_score(y_test,y_pred)*100)
print("Report : ", classification_report(y_test, y_pred))

# draw one tree of the forest and save it as an image
export_graphviz(b.estimators_[5], out_file='tree.dot',
                feature_names = X_train.columns.tolist(),
                class_names = ['Lotus', 'Jasmin', 'Rose'],
                rounded = True, proportion = False, precision = 2, filled = True)
os.system("dot -Tpng tree.dot -o randomforest.png -Gdpi=600")
Image(filename = 'randomforest.png')

# plot the importance of each feature
f = pd.Series(b.feature_importances_,index=X_train.columns.tolist()).sort_values(ascending=False)
sns.barplot(x=f, y=f.index)
plt.xlabel('Feature Importance Score')
plt.ylabel('Features')
plt.legend()
plt.show()


Explanation of the program:

The file flowers.csv holds 150 records in total for training. Using train_test_split(), 112 of them are used for training and the remaining 38 are used to test the trained model. In the decision tree below, samples = 112 in the first node is the total number of records supplied for training. value = [34,41,37] means that 34 records belong to jasmine, 41 to lotus and 37 to rose. entropy = 1.581 measures the uncertainty / disorder / impurity in the sample data, that is, how mixed the records are across the classes they have to be separated into.

It is calculated as follows. First, for each class, find the fraction of the total records that belong to it. Then take log base 2 of that fraction; a calculator for this is available at https://www.miniwebtool.com/log-base-2-calculator/. Do this separately for each of the classes jasmine, rose and lotus, multiplying each fraction by its logarithm. Finally, multiplying the sum of these products by -1 gives the entropy.


Entropy = - {Summation of (fraction of each class * log base 2 of that fraction)}

= -{ (34/112)*log2(34/112) + (41/112)*log2(41/112) + (37/112)*log2(37/112) }

= -{ (0.3035)*log2(0.3035) + (0.3661)*log2(0.3661) + (0.3303)*log2(0.3303) }

= -{ (0.3035)*(-1.7202) + (0.3661)*(-1.4496) + (0.3303)*(-1.5981) }

= -{ -0.5220 + -0.5307 + -0.5278 }

= -{ -1.5805 }

= 1.581
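The hand calculation above can be checked with a few lines of Python. This is only a minimal sketch: the function name `entropy` and its list-of-counts argument are chosen here for illustration and are not part of the original program.

```python
import math

def entropy(class_counts):
    """Base-2 entropy of a node, given the number of samples in each class."""
    total = sum(class_counts)
    result = 0.0
    for count in class_counts:
        if count == 0:
            continue  # an empty class contributes nothing to the sum
        fraction = count / total
        result -= fraction * math.log2(fraction)
    return result

# Class counts of the root node: value = [34, 41, 37]
root_entropy = entropy([34, 41, 37])
print(round(root_entropy, 3))
```

For a node that holds only one class, such as value = [0, 0, 37], the same function returns 0.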
The entropy value just calculated appears in the first node of the diagram. Since this value is greater than 0, the 112 records are split by a condition into two groups of 37 and 75. That is, records whose X2 (Petal length) value is below 2.35 go to the left node and the rest go to the right node. The entropy is then calculated again for each of the two groups. For the node on the left the entropy is 0.0. Such a node is called a decision node.

An entropy of 0 means that all the records in the node fall under a single class. Its value is [0,0,37], so the counts for jasmine and lotus are 0 and the count for rose is 37. This is therefore the decision node that decides a flower is a rose. The other nodes in the diagram are built the same way, testing the other features. The last level of the diagram holds the decision nodes that decide whether a flower is a jasmine or a lotus. Look at the value entries of the nodes in that last level, from left to right. The records decided as jasmine do not sit together as one group of 34; they are split across decision nodes as 30, 3 and 1. Likewise the records decided as lotus do not sit together as 41 but are split as 30, 8 and 3. That is why the entropy of these nodes is not exactly 0 but a value close to it.


Information Gain: 


How much of the information needed to classify the records at a given split is supplied by a feature is called Information Gain. Like entropy, it is a metric that helps classify the records correctly. Entropy is the impurity; the metric that measures how far that impurity is reduced is the information gain. Its formula is as follows.


Information Gain = Parent's entropy - weighted average of the children's entropy

Weighted average of the children's entropy =
[(no. of examples in left child node) / (total no. of examples in parent node) * (entropy of left node)] +
[(no. of examples in right child node) / (total no. of examples in parent node) * (entropy of right node)]

= (37/112)*0.0 + (75/112)*0.994
= 0 + 0.665625
= 0.665

Information Gain = 1.581 - 0.665 = 0.916
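The weighted average above is subtracted from the parent's entropy to obtain the gain of the split. The sketch below works this through for the root split. The class counts of the right child, [34, 41, 0], are an assumption inferred from 75 = 34 + 41; the function names are illustrative only.

```python
import math

def entropy(class_counts):
    """Base-2 entropy of a node, given the number of samples in each class."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)

def information_gain(parent_counts, children_counts):
    """Parent entropy minus the size-weighted average entropy of the children."""
    parent_total = sum(parent_counts)
    weighted = sum(sum(child) / parent_total * entropy(child)
                   for child in children_counts)
    return entropy(parent_counts) - weighted

# Root node [34, 41, 37] split into a pure left node and a mixed right node.
# The right-hand counts are assumed, not read from the original diagram.
gain = information_gain([34, 41, 37], [[0, 0, 37], [34, 41, 0]])
print(round(gain, 3))
```

A split that leaves the children as mixed as the parent would give a gain close to 0, so the tree prefers splits with a larger gain.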


If, in the program above, DecisionTreeClassifier() is given "gini" instead of criterion = "entropy" for training, it computes the Gini impurity and learns to build its branches as follows.

The diagram of the model learnt by the Random Forest method is as follows.

How much each feature in the training data contributes to the classification in the random forest can be seen in the following chart.
19. Clustering with K-Means


This is the first unsupervised learning algorithm we are going to study. Everything we have seen so far falls under supervised learning. In logistic regression, multi-class classification and the like, we train by supplying both the input and the output. We ourselves define the boundaries of the output classes under which the data is to be separated. In unsupervised learning only the inputs are supplied. Neither the number of groups to split into nor their boundaries are given.

In clustering of this kind, the groups are calculated using K-means, and the number of groups to split into can be found with the elbow method. In other words, asking the algorithm to learn from a given definition is supervised learning, while asking it to learn with no definition at all is unsupervised learning.

In the example below, two features, X1 and X2, are given. There is no Y. Using the input data alone, we have to sort the records into groups ourselves.


x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]


The program for this and its explanation are as follows.


https://gist.github.com/nithyadurai87/185e332ebce7028af265adbe86db40d5
import matplotlib.pyplot as plt
import math

def plots(cluster1_x1, cluster1_x2, cluster2_x1, cluster2_x2):
    # Draw cluster 1 as dots and cluster 2 as stars.
    plt.figure()
    plt.plot(cluster1_x1, cluster1_x2, '.')
    plt.plot(cluster2_x1, cluster2_x2, '*')
    plt.grid(True)
    plt.show()

def round1(c1_x1, c1_x2, c2_x1, c2_x2):
    # First pass: put every point in the cluster of the nearer centroid.
    cluster1_x1 = []
    cluster1_x2 = []
    cluster2_x1 = []
    cluster2_x2 = []

    for i, j in zip(x1, x2):
        a = math.sqrt(((i-c1_x1)**2 + (j-c1_x2)**2))
        b = math.sqrt(((i-c2_x1)**2 + (j-c2_x2)**2))
        if a < b:
            cluster1_x1.append(i)
            cluster1_x2.append(j)
        else:
            cluster2_x1.append(i)
            cluster2_x2.append(j)

    plots(cluster1_x1, cluster1_x2, cluster2_x1, cluster2_x2)

    # Move each centroid to the mean of its cluster, then run the second pass.
    c1_x1 = sum(cluster1_x1)/len(cluster1_x1)
    c1_x2 = sum(cluster1_x2)/len(cluster1_x2)
    c2_x1 = sum(cluster2_x1)/len(cluster2_x1)
    c2_x2 = sum(cluster2_x2)/len(cluster2_x2)

    round2(c1_x1, c1_x2, c2_x1, c2_x2)

def round2(c1_x1, c1_x2, c2_x1, c2_x2):
    # Second pass: reassign the points using the updated centroids.
    cluster1_x1 = []
    cluster1_x2 = []
    cluster2_x1 = []
    cluster2_x2 = []

    for i, j in zip(x1, x2):
        c = math.sqrt(((i-c1_x1)**2 + (j-c1_x2)**2))
        d = math.sqrt(((i-c2_x1)**2 + (j-c2_x2)**2))
        if c < d:
            cluster1_x1.append(i)
            cluster1_x2.append(j)
        else:
            cluster2_x1.append(i)
            cluster2_x2.append(j)

    plots(cluster1_x1, cluster1_x2, cluster2_x1, cluster2_x2)

x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]

# Plot the raw data first, then start from the centroids (13, 17) and (9, 10).
plots(x1, x2, [], [])

round1(x1[4], x2[4], x1[10], x2[10])


First, a scatter plot shows how X1 and X2 are laid out. At this stage it has not yet been found which points belong to which cluster, so the lists for the second cluster are passed as empty.


plots(x1,x2,[],[])
19.1 Centroids


To create the clusters, first pick one value from X1 and one from X2 at random. For the first cluster we have picked 13 from X1 and 17 from X2. Likewise, for the second cluster we have picked 9 from X1 and 10 from X2. These points are called centroids. Using them as the basis, we are going to split all the data into two clusters. So, for every record in the two features, the distance to each of the two chosen centroids is calculated with the formula below.
distance1 = sqrt((x1_data - 13)**2 + (x2_data - 17)**2)
distance2 = sqrt((x1_data - 9)**2 + (x2_data - 10)**2)

Working for the first two rows:

(15-13)**2 + (13-17)**2 = 4 + 16 = 20, sqrt(20) = 4.47
(15-9)**2 + (13-10)**2 = 36 + 9 = 45, sqrt(45) = 6.70

(19-13)**2 + (16-17)**2 = 36 + 1 = 37, sqrt(37) = 6.08
(19-9)**2 + (16-10)**2 = 100 + 36 = 136, sqrt(136) = 11.66

x1    x2    distance1    distance2
15    13    4.47         6.70
19    16    6.08         11.66
15    17    2            9.21
5     6     13.6         5.65
13    17    0            8.06
17    14    5            8.94
15    15    2.82         7.81
12    13    4.12         4.24
8     7     11.18        3.16
6     6     13.03        5
9     10    8.06         0
13    12    5            4.47

If the distance to the first cluster's centroid is the smaller of the two, the point falls in the first cluster; otherwise it falls in the second cluster. This gives four lists: x1, x2 for the first cluster and x1, x2 for the second cluster. They are drawn on the plot as dots and stars respectively.

cluster1_x1 = [15, 19, 15, 13, 17, 15, 12]
cluster1_x2 = [13, 16, 17, 17, 14, 15, 13]
cluster2_x1 = [5, 8, 6, 9, 13]
cluster2_x2 = [6, 7, 6, 10, 12]

plots(cluster1_x1,cluster1_x2,cluster2_x1,cluster2_x2)
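The assignment step that produces these two lists can be sketched compactly as follows. This is an illustrative rewrite, not the original listing: it stores each point as an (x1, x2) tuple and uses `math.dist` (Python 3.8 or later). On an exact tie it would pick the first centroid, whereas the original `if a < b` sends ties to the second; the sample data contains no ties.

```python
import math

x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]

# Starting centroids chosen in the text: (13, 17) and (9, 10).
centroids = [(13, 17), (9, 10)]

def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest Euclidean distance to point."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

clusters = [[], []]
for point in zip(x1, x2):
    clusters[nearest_centroid(point, centroids)].append(point)

print(clusters[0])
print(clusters[1])
```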



Once the first two clusters have been formed in this way, two centroids are chosen again from them. This time, however, they are not picked at random. The mean of x1 and of x2 is calculated within each cluster, and those means become the new centroids. This lets us form two somewhat more accurate clusters.
c1_x1 = (15 + 19 + 15 + 13 + 17 + 15 + 12) / 7
= 106/7
= 15.14


c1_x2 = (13 + 16 + 17 + 17 + 14 + 15 + 13) / 7
= 105/7
= 15


c2_x1 = (5 + 8 + 6 + 9 + 13) / 5
= 41/5
= 8.2


c2_x2 = (6 + 7 + 6 + 10 + 12) / 5
= 41/5
= 8.2


Then the distance between every data point and the newly found centroids is calculated again. Each record is placed in the cluster whose centroid is nearer. Two clusters are formed again in this way, and they can be seen to separate a little more cleanly.
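The two steps described so far (assign each point to its nearest centroid, then move each centroid to the mean of its cluster) can be repeated until the assignments stop changing. The following is a minimal sketch of that loop in plain Python; the function name `k_means` and the `max_rounds` safety limit are assumptions made for this illustration and do not appear in the original program.

```python
import math

def k_means(points, start_centroids, max_rounds=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = list(start_centroids)
    labels = None
    for _ in range(max_rounds):
        # Assignment step: label each point with the index of its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # nothing moved, so the clustering has settled
        labels = new_labels
        # Update step: move every centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]
points = list(zip(x1, x2))

labels, centroids = k_means(points, [(13, 17), (9, 10)])
print(labels)
print(centroids)
```

With the starting centroids used in the text, the point (13, 12) moves from the second cluster to the first in the second round, after which nothing changes.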
We can keep repeating this until the records settle properly into their clusters. This is what is called clustering with k-means. Here k stands for the number of clusters/groups to be formed, and means refers to finding the average of each feature and forming the groups on that basis. Next, let us see how the value of k is determined.


19.2 Elbow Method 


This method uses a plot to find how many clusters suit the given data. We can use the same data as above here too and check whether it correctly shows us 2 clusters. The program for this and its explanation are as follows.


https://gist.github.com/nithyadurai87/10b5b273151c80be97579d684279cd84


from sklearn.cluster import KMeans
from sklearn import metrics
from scipy.spatial.distance import cdist
import numpy as np
import matplotlib.pyplot as plt

x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]

X = np.array(list(zip(x1, x2)))

distortions = []

K = range(1,8)
for i in K:
    model = KMeans(n_clusters=i)
    model.fit(X)
    # Distortion: average distance from each point to its nearest centroid.
    distortions.append(sum(np.min(cdist(X, model.cluster_centers_, 'euclidean'), axis=1)) / X.shape[0])

plt.plot()
plt.plot(K, distortions, 'bx-')
plt.show()


Here the two features x1 and x2 are combined by numpy into a single array X. KMeans is then trained on that data. The training is repeated with the number of clusters set to each value from 1 to 7. Each time, it calculates how far the records lie from their centroid. In this way we find the number of clusters at which the deviation of the records becomes small. This deviation value is called the cost / distortion.

These values are then drawn as a plot. Its X axis holds the number of clusters and its y axis holds the corresponding distortion values. When all the records are placed in a single cluster, the distortion of the records from its centroid is shown as above 5; when they are split into 7 separate clusters, the distortion is shown as below 1.

Because the plot looks like a bent arm, this is called the Elbow method. The elbow-like bend in the plot occurs at the point 2 on the X axis, which tells us that it is enough to split the data into that many clusters. Beyond this point the distortion values fall only slightly. It is at this point that the elbow bends. So it is found that splitting the data into 2 clusters is the right choice.
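The distortion values read off the plot can be reproduced by hand. The sketch below computes the average distance to the nearest centroid for one cluster and for two clusters. The two-cluster centroids (14.875, 14.625) and (7.0, 7.25) are an assumption: they are the means of the two groups that the hand-worked procedure settles on, and a library run may land on slightly different ones.

```python
import math

x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]
points = list(zip(x1, x2))

def distortion(points, centroids):
    """Average distance from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) for p in points) / len(points)

# One cluster: the only centroid is the mean of all the points.
overall_mean = (sum(x1) / len(x1), sum(x2) / len(x2))
one_cluster = distortion(points, [overall_mean])

# Two clusters: centroids assumed as described in the lead-in.
two_clusters = distortion(points, [(14.875, 14.625), (7.0, 7.25)])

print(round(one_cluster, 2), round(two_clusters, 2))
```

The drop from one cluster to two is large, while further clusters can only shave off smaller amounts, which is what produces the bend in the curve.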
19.3 Silhouette Coefficient


The performance of an algorithm depends on how accurately it has predicted. In everything we have seen so far, we found the performance by comparing the algorithm's predictions with the true values. In unsupervised learning such as k-means, however, we have no such values to compare against, and the silhouette coefficient is a method that helps measure performance in that situation.
To check whether the records grouped by k-means have been grouped properly, we have already measured the distortion. It judges the performance of kmeans by how far each record lies from its centroid. In the same way, the silhouette coefficient uses the following formula to measure how compactly each cluster of records has been separated.

(b - a) / max(a, b)

Here a is the average distance between a record and the other records in the same cluster. b is the average distance between that record and the records of the nearest neighbouring cluster.
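To make the formula (b - a) / max(a, b) concrete, here is a minimal plain-Python sketch that applies it to every record and averages the results. The function name is illustrative; the labels passed in are the two-cluster labels printed by the program further below. A cluster with a single record is scored 0 here, which is an assumption about how to handle that edge case.

```python
import math

def silhouette_coefficient(points, labels):
    """Mean of (b - a) / max(a, b) over all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # a lone point has no intra-cluster distance
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]
labels = [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1]

score = silhouette_coefficient(list(zip(x1, x2)), labels)
print(round(score, 4))
```

Values near 1 mean the clusters are compact and well apart; values near 0 mean they overlap.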

In the example below, our data is first split into 2 clusters by kmeans. In the same way, a for loop then splits it into 3, 4, 5 and 8 clusters. Inside the loop, each time the clusters are formed with the given count, the program draws a plot of how the records are separated and prints the silhouette coefficient value.

https://gist.github.com/nithyadurai87/f5f043df412b6e3c8291d0080422bd92


import numpy as np
from sklearn.cluster import KMeans
from sklearn import metrics
import matplotlib.pyplot as plt

# First panel: the raw data.
plt.subplot(3, 2, 1)
x1 = [15, 19, 15, 5, 13, 17, 15, 12, 8, 6, 9, 13]
x2 = [13, 16, 17, 6, 17, 14, 15, 13, 7, 6, 10, 12]
plt.scatter(x1, x2)

X = np.array(list(zip(x1, x2)))

# One colour and one marker per cluster (up to 8 clusters).
c = ['b', 'g', 'r', 'c', 'm', 'y', 'k', 'b']
m = ['o', 's', 'D', 'v', '^', 'p', '*', '+']
p = 1

for i in [2, 3, 4, 5, 8]:
    p += 1
    plt.subplot(3, 2, p)
    model = KMeans(n_clusters=i).fit(X)
    print (model.labels_)
    # Plot every point with the colour and marker of its cluster.
    for idx, label in enumerate(model.labels_):
        plt.plot(x1[idx], x2[idx], color=c[label], marker=m[label], ls='None')
    print (metrics.silhouette_score(X, model.labels_, metric='euclidean'))
plt.show()


print (model.labels_) labels the first cluster 0 and the second cluster 1, and shows which cluster each of the 12 records in x1 and x2 has been placed in; the coefficient value is printed after it, as follows.


[1 1 1 0 1 1 1 1 0 0 0 1]
0.6366488776743281


Likewise, when splitting into 3 clusters, 0 denotes the first cluster, 1 the second and 2 the third:

[0 0 0 1 0 0 0 2 1 1 1 2]
0.38024538066050284

In the same way, when splitting into 4, 5 and 8 clusters, the cluster labels of the records and the score for that grouping are printed as follows. Comparing them, the highest score (0.63) is obtained only when the data is split into 2 clusters.

[2 0 0 1 0 0 0 2 1 1 3 2]
0.32248773306926665

[2 4 0 1 0 4 0 2 1 1 3 2]
0.38043265897525885

[6 7 3 4 3 1 6 2 0 4 5 2]
0.27672998081717154

In the figure below, the first panel shows just the raw data. The second shows the result of splitting into 2 clusters. The following panels show the results of forming 3, 4, 5 and 8 clusters. The records are split into at most 8 clusters. So, to tell apart the records in each cluster, two lists are created holding 8 colours and 8 different marker shapes. Inside the loop they are applied one by one to produce the following output.
20. Support Vector Machine (SVM)


Support Vector Machine (SVM) is a method for classifying and separating data. We have already looked at logistic regression for this purpose. SVM, however, performs the job of classification somewhat more precisely than logistic regression. In this section we will see how the large margin classifier helps with data that can be separated by a straight line, and how kernels help with data that cannot be separated by a straight line.

20.1 Large margin classifier (linear) 


The example below shows how logistic regression and SVM each separate data that can be classified with a straight line. It has two features, X1 and X2. They are combined by numpy into a single array with 2 dimensions (2 dimension matrix). We then train both logistic regression and the SVM on that data. The code that shows exactly where each of them places the straight line separating the data is written inside classifier().

https://gist.github.com/nithyadurai87/2de5a6a6f7cc03c2791305f5c33d43d7


import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
# LogisticRegression is imported from sklearn.linear_model; the older
# sklearn.linear_model.logistic path is not available in recent scikit-learn releases.
from sklearn.linear_model import LogisticRegression

def classifier():
    # Plot the decision boundary w0*x1 + w1*x2 + b = 0 of the current `regressor`.
    xx = np.linspace(1,10)
    yy = - regressor.coef_[0][0] / regressor.coef_[0][1] * xx - regressor.intercept_[0] / regressor.coef_[0][1]
    plt.plot(xx, yy)
    plt.scatter(x1,x2)
    plt.show()

x1 = [2,6,3,9,4,10]
x2 = [3,9,3,10,2,13]

X = np.array([[2,3],[6,9],[3,3],[9,10],[4,2],[10,13]])
y = [0,1,0,1,0,1]

# Boundary found by logistic regression.
regressor = LogisticRegression()
regressor.fit(X,y)
classifier()

# Boundary found by a linear SVM.
regressor = svm.SVC(kernel='linear',C = 1.0)
regressor.fit(X,y)
classifier()


When the data is separated by logistic regression, the straight line is placed as follows. It lies very close to the lower class, with no gap at all, while the gap between the line and the upper class is large.

When the data is separated by SVM, the line between the two classes lies at an equal distance from both of them. That is why it is called an equal margin / large margin classifier. It is regarded as an optimization of logistic regression.
20.2 Kernels (non-linear)


A kernel is used to classify data laid out in a somewhat complex, non-linear way that cannot be separated by drawing a straight line. To fit data that does not lie on a straight line, we have already seen polynomial regression. There, however, the higher order values of each feature are calculated and counted as newly added features. So we keep calculating and adding features of order square, cube and so on until the data fits fully. Because this adds a large number of features to the training data, both the time the algorithm needs to learn and the amount the computer has to hold in memory increase. Kernels / similarity functions came about to avoid this.

Instead of adding more and more features, a kernel calculates new features from the existing ones and uses those. For example, suppose our training data has 5 features and 100 samples. With the polynomial approach, the square and cube values of those 5 features are computed, and we end up with more than 20 features. When fitting with a kernel instead, one record is chosen from the 100 samples and set as a landmark. Then how far the other records lie from it is calculated. Records close to the landmark are scored as 1 and the others as 0. The new feature is calculated from this. In other words, for the 5 features in the training data, only 5 new features are calculated by this method.


The equation for this similarity function is given below. The similarity function is also called a kernel. A kernel can use one of several formulas to carry out this calculation. One of them, based on exp(), is given below; it is called the gaussian kernel. In the same way there are many other kinds of kernel formula, such as the polynomial kernel, string kernel, chi-squared kernel and histogram-intersection kernel.


f1 = similarity(x, l1)

= exp(-(||x - l1||**2 / (2 * sigma**2)))
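The gaussian similarity above is easy to try out directly. The following is a minimal sketch; the function name `gaussian_kernel`, the default `sigma=1.0` and the sample points are all chosen here for illustration.

```python
import math

def gaussian_kernel(x, landmark, sigma=1.0):
    """exp(-||x - landmark||**2 / (2 * sigma**2)) for two equal-length sequences."""
    squared_distance = sum((xi - li) ** 2 for xi, li in zip(x, landmark))
    return math.exp(-squared_distance / (2 * sigma ** 2))

landmark = [3.0, 5.0]

same_point = gaussian_kernel([3.0, 5.0], landmark)   # exactly on the landmark
near_point = gaussian_kernel([3.0, 6.0], landmark)   # one unit away
far_point = gaussian_kernel([13.0, 15.0], landmark)  # far away

print(same_point, near_point, far_point)
```

A point on the landmark scores 1, and the score falls towards 0 as the point moves away. A larger sigma makes the score fall off more slowly.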
SVM without kernels behaves much like logistic regression. That is, if the classification is carried out directly on the raw features, without using the new features produced by kernels, it corresponds to logistic regression. So let us see when to use a kernel and when to use logistic regression. If the number of selected features (100000 or 100) is much larger or much smaller than the number of training samples (10000), SVM without kernel can be used. If the number of features (1000) is not very large but moderately large, SVM with kernel can be used.


In the example below, a flower is classified as jasmine, rose or lotus from several features. We can see that the accuracy is higher when the classification is done with a kernel than when it is done with SVM without kernel, that is, with logistic regression.


https://gist.github.com/nithyadurai87/9d7cc99cc4ae18a3707cc76f8711193b


import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix
# LogisticRegression is imported from sklearn.linear_model; the older
# sklearn.linear_model.logistic path is not available in recent scikit-learn releases.
from sklearn.linear_model import LogisticRegression
from matplotlib.colors import ListedColormap

df = pd.read_csv('./flowers.csv')

X = df[list(df.columns)[:-1]]
y = df['Flower']

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 0)

# Baseline: logistic regression on the raw features.
logistic = LogisticRegression()
logistic.fit(X_train, y_train)
y_pred = logistic.predict(X_test)

print ('Accuracy-logistic:', accuracy_score(y_test, y_pred))

# SVM with the gaussian (RBF) kernel.
gaussian = SVC(kernel='rbf')
gaussian.fit(X_train, y_train)
y_pred = gaussian.predict(X_test)

print ('Accuracy-svm:', accuracy_score(y_test, y_pred))


Output:


Accuracy-logistic: 0.868421052631579 
Accuracy-svm: 0.9736842105263158 
21. PCA - Principal Component Analysis


Principal Component Analysis is used to convert data with a large number of dimensions into data with fewer dimensions. For example, suppose something is predicted from 1000 features. PCA converts these 1000 X into 100 X, or into even fewer dimensions. It does not concern itself with the number of Y; it reduces only the number of X. That is why PCA is a special kind of method that helps with dimensionality reduction. The steps involved are as follows.


First, obtain the training data (x1,y1), (x2,y2), (x3,y3)...


Next, use PCA to convert all the X in the training data to the smaller number of dimensions we need.


Then train with the new, reduced X.
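The reduction step in the list above can be sketched with NumPy alone. This is only an illustration of the idea under stated assumptions: it centres the data and projects it onto the leading directions found by a singular value decomposition, which is one common way to compute PCA. The function name `pca_reduce` and the small sample matrix are made up for this sketch.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its first n_components principal directions.

    Returns the reduced data and the share of variance each kept direction explains.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of `directions` are the principal directions, strongest first.
    _, singular_values, directions = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    reduced = centered @ directions[:n_components].T
    return reduced, explained[:n_components]

# Six samples with three features, reduced to two.
X = [[2, 3, 1], [6, 9, 2], [3, 3, 1], [9, 10, 3], [4, 2, 1], [10, 13, 4]]
reduced, explained = pca_reduce(X, 2)

print(reduced.shape)
print(explained)
```

Only X is passed in; the labels Y play no part in the reduction, and the reduced X is what would then be used for training.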


In general, PCA is not used everywhere; it is used in special cases. For example, for an algorithm that recognises human faces or vehicles, the training data will have at least 1 lakh (100,000) features, because a very large number of features is needed to identify every small detail of a person's face. In such cases, PCA is used to convert the data to a smaller number of features instead of using all of them. Before PCA is applied, a step called feature scaling must always be carried out. This is referred to as data-preprocessing.

To make this easy to understand, the example below uses PCA to convert data
with 4 dimensions into 2 dimensions. Before PCA is applied, the data is
normalized with StandardScaler. The data describes flowers of the kinds Rose,
Jasmin and Lotus by four features: the length and width of their sepals and
the length and width of their petals. PCA converts these into two features,
x1 and x2. Based on these two features, the three kinds of flower are plotted
in different colours on the graph.

https://gist.github.com/nithyadurai87/20d18bbda53e43de19222e24d330a398


import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

df = pd.read_csv('./flowers.csv')

# All columns except the last are features; 'Flower' is the label.
X = df[list(df.columns)[:-1]]
y = df['Flower']

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 0)

pca = PCA(n_components=2)

# Feature scaling must happen before PCA.
x = StandardScaler().fit_transform(X_train)

new_x = pd.DataFrame(data = pca.fit_transform(x), columns = ['x1', 'x2'])

# Reset the index so that each label lines up with its transformed training row.
df2 = pd.concat([new_x, y_train.reset_index(drop=True)], axis = 1)

fig = plt.figure(figsize = (8,8))
ax = fig.add_subplot(1,1,1)
ax.set_xlabel('x1', fontsize = 15)
ax.set_ylabel('x2', fontsize = 15)
ax.set_title('2 Components', fontsize = 20)

# One scatter series per flower, each in its own colour.
for i, j in zip(['Rose', 'Jasmin', 'Lotus'], ['g', 'b', 'r']):
    ax.scatter(df2.loc[df2['Flower'] == i, 'x1'], df2.loc[df2['Flower'] == i, 'x2'], c = j)

ax.legend(['Rose', 'Jasmin', 'Lotus'])
ax.grid()
plt.show()

print (pca.explained_variance_ratio_)
print (df.columns)
print (df2.columns)


Output:


[0.72207932 0.24134489]
Index(['Sepal length', 'Sepalwidth', 'Petal length', 'Petal width',
'Flower'], dtype='object')
Index(['x1', 'x2', 'Flower'], dtype='object')


In this output, the explained variance is given as [0.72207932 0.24134489].
Adding the two values gives about 0.96. This means that the two components
together retain about 96% of the information. When features are reduced, a
small loss of information is to be expected. Variance describes how far the
data points are spread out from one another. This, and how PCA works, are
covered in more detail below.
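The addition described above can be checked with a few lines of NumPy. This is only a small sketch; the variable names are illustrative, and the two ratios are copied from the output printed earlier.

```python
import numpy as np

# Explained variance ratios printed by the PCA example above.
ratios = np.array([0.72207932, 0.24134489])

# Running total: how much variance is retained by the first 1, 2, ... components.
cumulative = np.cumsum(ratios)
retained = cumulative[-1]

print(cumulative)
print('Variance retained by 2 components: %.2f%%' % (100 * retained))
```

With scikit-learn, the same running total could be taken from `pca.explained_variance_ratio_` directly.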


21.1 Data Projection 


The line or area onto which data is mapped in order to reduce its dimensions
is called the projection line or projection area. Consider the two graphs
below. The one on the left shows how data with 2 dimensions is converted to 1
dimension. It is a scatter plot of two features, x1 and x2. The line running
through the middle of the points is the line used for projection. All of the
data points are projected onto this line and so converted into a single
dimension. Similarly, the graph on the right shows data points for 3 features,
X1, X2 and X3. The projection area for them is marked like a sheet drawn
using 2 dimensions. The data points scattered around it are projected onto
this sheet and so converted into vectors with 2 dimensions.


21.2 Projection Error


In both graphs above, each data point is projected onto the nearest position
on the line or area. The distance between a data point and its projection is
called the projection error. The graph for converting 2D to 1D may remind you
of linear regression, but PCA is not linear regression. The line through the
middle is not used for prediction; it is used only for projection. No Y
values are predicted from it; it only transforms the X values. Also, in
linear regression the sum of squares error is measured vertically, from the
data point to the line. In PCA the projection error is measured sideways, at
right angles to the line.

[Figure: projection error]


21.3 Compressed components


The steps that make it possible to compress data with many dimensions
correctly into fewer dimensions are as follows.

1. First, feature scaling must be applied to all of the data (x1, x2,
x3...xn). This is called data preprocessing.

2. A covariance matrix is built to capture how the data of the features vary
with one another. It is also called sigma. Its formula is as follows.


covariance matrix / sigma = (1/m) . summation of (i = 1 to m) [x(i) .
transpose of x(i)]

Check that this matrix has the symmetric positive definite property. Only
then can it be used to build the vectors for projection.


3. The vectors for projection are built with the svd() or eig() function.
These stand for singular value decomposition and eigenvector respectively. It
is done as follows.


[u,s,v] = svd(sigma)


This returns 3 matrices. The matrix u holds the vectors for projection, u1,
u2, u3...un. From these we can pick as many as we need: u1, u2, u3...uk,
where k is the number of principal components. The formula for this is as
follows.


principal components = transpose of (u[:, 1:k]) . x


4. Next, make sure that choosing this many principal components does not
lose too much information. This is measured by the variance retained. In
general it is good to retain 99% of the variance. The formula used to find
the value of k is as follows.

1 - (Average squared projection error / Total variance in the data) >= 0.99
(that is, 99% of the variance is retained)


Where,

Avg. squared projection error = (1/m) . summation of (i = 1 to m)
square of (x(i) - projected x(i))

Total variance in the data = (1/m) . summation of (i = 1 to m)
square of (x(i))

Substitute increasing values of k into the formula above and check each time
whether the result reaches 0.99. Alternatively, k can be found directly by
substituting the matrix S returned by svd() into the following formula.

summation of (i = 1 to k) S[i,i] / summation of (i = 1 to n) S[i,i] >= 0.99


In this way, data with many dimensions is compressed into fewer dimensions
without much loss of information.
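The four steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the book's own listing: the function name `pca_svd`, the synthetic data and the 0.99 threshold argument are all hypothetical, and `np.linalg.svd` stands in for the `svd()` call mentioned in step 3.

```python
import numpy as np

def pca_svd(X, variance_to_retain=0.99):
    """Project X (m rows, n features) onto the fewest principal
    components that retain the requested share of the variance."""
    # Step 1: feature scaling (zero mean, unit variance per feature).
    X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
    m = X_scaled.shape[0]

    # Step 2: covariance matrix, sigma = (1/m) * sum of x . transpose(x).
    sigma = (X_scaled.T @ X_scaled) / m

    # Step 3: singular value decomposition of sigma.
    u, s, _ = np.linalg.svd(sigma)

    # Step 4: smallest k for which sum(S[:k]) / sum(S) reaches the threshold.
    retained = np.cumsum(s) / np.sum(s)
    k = int(np.searchsorted(retained, variance_to_retain) + 1)

    # Projection of every row: z = transpose(u[:, :k]) . x
    Z = X_scaled @ u[:, :k]
    return Z, k, retained

# Hypothetical data: 4 features, two of which are near-copies of the other two.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([base[:, 0], base[:, 1],
                     base[:, 0] + 0.01 * rng.normal(size=200),
                     base[:, 1] + 0.01 * rng.normal(size=200)])

Z, k, retained = pca_svd(X)
print(k, Z.shape)
print(retained)
```

Because two of the four columns carry almost no new information, two components are enough to pass the 99% threshold here.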
22. Neural Networks


A neural network is modelled on the way the human brain learns. When a child
is first born, its brain knows nothing. Then a single nerve cell (neuron)
starts to learn a new thing. Next, another neuron joins it and, together with
what has already been learnt, learns yet another new thing. In this way many
neurons are connected to one another like a web, and step by step they go on
learning more and more new things. The neural network is built on this
principle.


It learns by classifying things, so it is treated as a classification
problem. In binary classification with two features x1 and x2, logistic
regression uses the features directly to predict h(x). A neural network does
not use the raw features directly. Instead it creates a hidden layer, builds
several activation units in it, and predicts from those. The formula for this
is as follows.

$$h(x) = g(\theta_0 a_0 + \theta_1 a_1 + \theta_2 a_2)$$

Where

$$a_0 = g(\theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2)$$

Likewise $a_1$ and $a_2$


The value of the first activation unit is obtained by combining the features
with their parameters through the sigmoid function g. The value of every
other activation unit is computed in the same way. The parameters are what we
have called theta; in neural networks they are called weights. The final h(x)
value is therefore computed by combining the activation units and the weights
through the sigmoid function.
22.1 Neural Network structure


When the number of features needed for a prediction is very large, neural
networks can be used instead of logistic regression.


A neural network for binary classification looks as follows.

A neural network for multi-class classification looks as follows.

[Figure: Layer1 (Input), Layer2 (Activation/hidden), Layer3 (Output)]


The values x0 and a0, which are added to each layer so that the
multiplication works out, are called bias units.


Input layer: the raw features are found in the first layer.

Output layer: the last layer, where the prediction is made.

Hidden layer / Activation layer: there can be many of these layers in
between. In the first hidden layer, the units (activation units) are built
from the raw features. In each following hidden layer, the units are built
from the units of the hidden layer before it.

22.2 How h(x) predictions are computed

In the figure below, weights have been given to each unit. h(x) is computed
for each unit by substituting them into the sigmoid function.

Depending on the values of the weights, the result follows rules such as AND,
OR and NOT.


For example, let us substitute the weights -30, 20, 20 into the function g(z)
and see what comes out when x1, x2 are 0,0, what comes out when they are 0,1,
and what comes out for 1,0 and 1,1. In the tables, z is the weighted sum that
is passed to g, and the bias unit x0 is always 1.


AND: When the inputs are 0,0 the value of z is -30. In the sigmoid graph seen
earlier, -30 corresponds to 0. The following values are computed in the same
way. Together they match the truth table for AND: h(x) is 1 only when both x1
and x2 are 1.

Weights = -30, 20, 20

x1 | x2 | z = -30*x0 + 20*x1 + 20*x2 | h(x) = g(z) (AND)
0  | 0  | -30 + 20*0 + 20*0 = -30    | 0
0  | 1  | -30 + 20*0 + 20*1 = -10    | 0
1  | 0  | -30 + 20*1 + 20*0 = -10    | 0
1  | 1  | -30 + 20*1 + 20*1 = 10     | 1


OR: Substitute the weights -10, 20, 20 into g(z). The result matches the
truth table for OR: h(x) is 1 when both x1 and x2 are 1, and it is also 1
when either one of x1 or x2 is 1.

Weights = -10, 20, 20

x1 | x2 | z = -10*x0 + 20*x1 + 20*x2 | h(x) = g(z) (OR)
0  | 0  | -10 + 20*0 + 20*0 = -10    | 0
0  | 1  | -10 + 20*0 + 20*1 = 10     | 1
1  | 0  | -10 + 20*1 + 20*0 = 10     | 1
1  | 1  | -10 + 20*1 + 20*1 = 30     | 1


NOT: The 3rd unit in the second layer is computed as NOT x1 AND NOT x2. That
is, the values of NOT x1 and NOT x2 are computed first and then combined with
AND. The weights used for NOT are 10, -20.


Weights = 10, -20

x1 | x2 | z = 10*x0 - 20*x1 | NOT(x1) | z = 10*x0 - 20*x2 | NOT(x2) | NOT x1 AND NOT x2
0  | 0  | 10                | 1       | 10                | 1       | 1
0  | 1  | 10                | 1       | -10               | 0       | 0
1  | 0  | -10               | 0       | 10                | 1       | 0
1  | 1  | -10               | 0       | -10               | 0       | 0


Putting all of these together, the values for the neural network seen above
are as follows.

x1 | x2 | a1 | a2 | h(x)
0  | 0  | 0  | 1  | 1
0  | 1  | 0  | 0  | 0
1  | 0  | 0  | 0  | 0
1  | 1  | 1  | 0  | 1
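The truth tables above can be reproduced with a short Python sketch. It assumes that a1 is the AND unit, a2 is the NOT x1 AND NOT x2 unit, and that the output unit combines a1 and a2 with the OR weights, which is what the final table implies. The function names are illustrative only.

```python
import math

def g(z):
    """Sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))

def unit(weights, inputs):
    """One activation unit: the bias weight (for x0 = 1) plus the
    weighted inputs, passed through the sigmoid."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return g(z)

AND_W = (-30, 20, 20)
OR_W = (-10, 20, 20)
NOT_W = (10, -20)

def network(x1, x2):
    a1 = unit(AND_W, (x1, x2))                # x1 AND x2
    not_x1 = unit(NOT_W, (x1,))               # NOT x1
    not_x2 = unit(NOT_W, (x2,))               # NOT x2
    a2 = unit(AND_W, (not_x1, not_x2))        # NOT x1 AND NOT x2
    return unit(OR_W, (a1, a2))               # a1 OR a2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(network(x1, x2)))
```

The sigmoid outputs are never exactly 0 or 1, so the sketch rounds the final value to compare it with the table.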


22.3 Forward propagation 


Layer 1:

$$a = x, \qquad \theta = \begin{bmatrix} \theta_0 & \theta_1 \end{bmatrix}, \qquad x = \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}$$

Layer 2:

$$a = g(\theta \cdot x), \qquad \theta = \begin{bmatrix} \theta_0 & \theta_1 & \theta_2 \end{bmatrix}, \qquad a = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}$$

Layer 3:

$$h(x) = g(\theta \cdot a), \qquad a = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}$$


The activation units in the first layer are the raw features themselves. This
is the input layer.

The second layer is the hidden layer. The units in it are built from the
features of the first layer and their weights.


The last layer is the output layer. The unit in it is built from the units of
the hidden layer and their weights.

Computing the values of the units in each successive layer in this way, by
combining the values of the units in the previous layer with their weights,
is called forward propagation.
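Forward propagation for one hidden layer can be sketched in NumPy as below. The matrix names `theta1` and `theta2`, the function name `forward`, and the single-unit weights 10, -20, -20 for NOT x1 AND NOT x2 are assumptions made for this illustration; the AND and OR weights are the ones used in section 22.2.

```python
import numpy as np

def g(z):
    """Sigmoid function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, theta1, theta2):
    """x: raw features without the bias unit.
    theta1: weights from layer 1 to layer 2, shape (hidden, n + 1).
    theta2: weights from layer 2 to layer 3, shape (outputs, hidden + 1)."""
    a1 = np.concatenate(([1.0], x))        # input layer plus bias unit x0 = 1
    a2 = g(theta1 @ a1)                    # hidden layer activation units
    a2 = np.concatenate(([1.0], a2))       # add bias unit a0 = 1
    h = g(theta2 @ a2)                     # output layer
    return a1, a2, h

theta1 = np.array([[-30.0, 20.0, 20.0],     # AND
                   [10.0, -20.0, -20.0]])   # NOT x1 AND NOT x2 (assumed weights)
theta2 = np.array([[-10.0, 20.0, 20.0]])    # OR

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    _, _, h = forward(np.array(x, dtype=float), theta1, theta2)
    print(x, int(round(float(h[0]))))
```

Each layer's values depend only on the layer before it, which is why the computation runs strictly from input to output.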


22.4 Back propagation 


Back propagation is about finding out which weights to use for each unit in
our neural network so that the errors are reduced. The errors occurring at
each layer are found, and their partial derivative values are computed
working backwards from the last layer. These are then combined to find the
cost of the network. In general, the gradient descent algorithm arrives at
the weights that give the lowest cost; back propagation is used to set the
weights of the neurons in a way that helps it do so.

delta = error of each node in the corresponding layer
Layer 3 : delta3 = h(x) - y
Layer 2 : delta2 = theta^T . delta3 .* a .* (1 - a)
Layer 1 : delta1 = theta^T . delta2 .* a .* (1 - a)

where g'(z) = a .* (1 - a) is g-prime, the derivative of the activation
function g

.* = element-wise multiplication

[Equation: accumulator matrix]
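The delta formulas above can be sketched for a network with one hidden layer as follows. This is an illustration under assumptions that the text does not state: a cross-entropy cost, a single training row, randomly chosen weights, and gradients formed as the outer product of each delta with the previous layer's activations. The names `cost` and `backprop` are hypothetical.

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta1, theta2, x, y):
    """Cross-entropy cost for one training row (assumed cost function)."""
    a1 = np.concatenate(([1.0], x))
    a2 = np.concatenate(([1.0], g(theta1 @ a1)))
    h = g(theta2 @ a2)
    return float(-(y * np.log(h) + (1 - y) * np.log(1 - h)).sum())

def backprop(theta1, theta2, x, y):
    # Forward pass, keeping the activations of every layer.
    a1 = np.concatenate(([1.0], x))
    a2 = np.concatenate(([1.0], g(theta1 @ a1)))
    h = g(theta2 @ a2)

    # Layer 3: delta3 = h(x) - y
    delta3 = h - y
    # Layer 2: delta2 = theta^T . delta3 .* a .* (1 - a)
    delta2 = (theta2.T @ delta3) * a2 * (1 - a2)
    delta2 = delta2[1:]                    # the bias unit has no error term

    # Gradients: each delta combined with the activations feeding into it.
    grad1 = np.outer(delta2, a1)
    grad2 = np.outer(delta3, a2)
    return grad1, grad2

rng = np.random.default_rng(1)
theta1 = rng.normal(size=(2, 3))
theta2 = rng.normal(size=(1, 3))
x = np.array([0.4, 0.3])
y = np.array([1.0])

grad1, grad2 = backprop(theta1, theta2, x, y)
print(grad1)
print(grad2)
```

A common way to gain confidence in such a sketch is to compare one gradient entry with a numerical estimate obtained by nudging the matching weight slightly up and down and differencing the cost.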
23. Perceptron


The perceptron is the basis for neural networks. It is a binary
classification algorithm for data that can be separated by a straight line.
However, it does not learn in the way logistic regression does. Modelled on
the way a neuron learns a little at a time, it learns about the training data
step by step. In the example below, 4 training rows are given. Each row has 2
features, x1 and x2, and a label y that says whether the row falls under
class 0 or class 1.

x1 , x2 , y
[0.4, 0.3, 1],
[0.6, 0.8, 1],
[0.7, 0.5, 1],
[0.9, 0.2, 0]

We have already seen that neural networks do not learn directly; they build
several activation units in between and learn on that basis. In the same way,
the perceptron does not learn the hypothesis directly by combining the
features and the weights. It computes an activation unit in between. Then,
based on its value, it adjusts the weights for one row at a time and learns
iteratively, as follows. What were called parameters are called weights here.

https://gist.github.com/nithyadurai87/e6794ec008a7855681db4ba9164b54af


# Predict 1 if the weighted sum (bias weight + weights . features) is above 0.
def predict(row, weights):
    activation = weights[0]
    for i in range(len(row)-1):
        activation += weights[i + 1] * row[i]
    return 1.0 if activation > 0.0 else 0.0

# Learn the weights one row at a time, for n_epoch passes over the dataset.
def train_weights(dataset, l_rate, n_epoch):
    weights = [0.0 for i in range(len(dataset[0]))]
    for epoch in range(n_epoch):
        sum_error = 0.0
        for row in dataset:
            # The last column of each row is the actual label y.
            error = row[-1] - predict(row, weights)
            sum_error += error**2
            weights[0] = weights[0] + l_rate * error
            for i in range(len(row)-1):
                weights[i + 1] = weights[i + 1] + l_rate * error * row[i]
        print('epoch=%d, error=%.2f' % (epoch, sum_error))
    print (weights)
    return weights

dataset = [[0.4,0.3,1],
           [0.6,0.8,1],
           [0.7,0.5,1],
           [0.9,0.2,0]]

l_rate = 0.1
n_epoch = 6

train_weights(dataset, l_rate, n_epoch)


Output of the program:


epoch=0, error=2.00
epoch=1, error=2.00
epoch=2, error=2.00
epoch=3, error=2.00
epoch=4, error=1.00
epoch=5, error=0.00
[0.1, -0.16, 0.06999999999999998]

How the calculations work:

To begin with, the weights to be combined with the given features are set to
0, 0, 0 and learning starts from there. The first zero is the weight for the
bias unit x0. We have already seen that the bias unit always has the value 1.
The zeros that follow are the weights for x1 and x2. With these, the
activation unit for the first row [0.4, 0.3, 1] is computed using the formula
below. The rule that turns it into a prediction is called the heaviside
activation function. Like sigmoid, it is another kind of activation function.


Activation_unit_1 = w0.x0 + w1.x1 + w2.x2
= 0(1) + 0(0.4) + 0(0.3)
= 0

if Activation unit > 0, Predict 1
else Predict 0.


So if the computed value is greater than 0 it predicts 1, and otherwise it
predicts 0. Here it predicts 0, but the training data gives 1. When the value
in the training data and the value predicted from the activation unit do not
match like this (1 != 0), the weights must be changed before training moves
on to the next row. The new weights are computed with the following formula.


w0 = w0 + learning_rate * (actual - predict) * x0


Here, a correction is added to the old value of each weight. The learning
rate is like the value we used in gradient descent: it controls the size of
the update. It is set to 0.1 here, which means the weights are adjusted only
in very small steps. It is multiplied by the difference between the actual
value and the predicted value, and by the value of the feature that the
weight is attached to. This gives the new value of the weight.


The weights computed with this formula are as follows.

w0 = 0 + 0.1 * 1 * 1 = 0.10
w1 = 0 + 0.1 * 1 * 0.4 = 0.04
w2 = 0 + 0.1 * 1 * 0.3 = 0.03

Using these new weights, the activation unit for the 2nd row [0.6, 0.8, 1] is
computed as follows.

Activation_unit_2 = w0.x0 + w1.x1 + w2.x2
= 0.1(1) + 0.04(0.6) + 0.03(0.8)
= 0.1 + 0.024 + 0.024
= 0.148

Since this is greater than 0, it predicts 1. The training data also gives 1.
So the weights are left unchanged and the activation unit for the 3rd row
[0.7, 0.5, 1] is computed.

Activation_unit_3 = w0.x0 + w1.x1 + w2.x2
= 0.1(1) + 0.04(0.7) + 0.03(0.5)
= 0.1 + 0.028 + 0.015
= 0.143

Here too it predicts 1 and the training data gives 1, so the weights are left
unchanged and the activation unit for the 4th row [0.9, 0.2, 0] is computed.

Activation_unit_4 = w0.x0 + w1.x1 + w2.x2
= 0.1(1) + 0.04(0.9) + 0.03(0.2)
= 0.1 + 0.036 + 0.006
= 0.142

w0 = 0.1 + 0.1 * -1 * 1 = 0.0
w1 = 0.04 + 0.1 * -1 * 0.9 = -0.05
w2 = 0.03 + 0.1 * -1 * 0.2 = 0.01
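The hand calculation for the first epoch can be retraced with a small Python sketch. The helper name `step` is hypothetical; it applies the same prediction rule and the same update formula as the text, one row at a time, starting from weights 0, 0, 0 and a learning rate of 0.1.

```python
def step(weights, row, l_rate=0.1):
    """One perceptron update for a row [x1, x2, y].
    Returns the new weights and the prediction made before updating."""
    *features, actual = row
    # Activation unit: bias weight (x0 = 1) plus weighted features.
    activation = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    predicted = 1 if activation > 0 else 0
    error = actual - predicted
    # w = w + learning_rate * (actual - predict) * x; unchanged when error is 0.
    new_weights = [weights[0] + l_rate * error]
    new_weights += [w + l_rate * error * x for w, x in zip(weights[1:], features)]
    return new_weights, predicted

dataset = [[0.4, 0.3, 1], [0.6, 0.8, 1], [0.7, 0.5, 1], [0.9, 0.2, 0]]
weights = [0.0, 0.0, 0.0]
predictions = []
for row in dataset:
    weights, predicted = step(weights, row)
    predictions.append(predicted)
    print(row, predicted, [round(w, 2) for w in weights])
```

Calling `step` over the dataset again with the resulting weights would retrace the later epochs in the same way.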

Here it predicts 1, but the training data gives 0, so the weights are computed
again. Of the 4 training rows given, 2 were predicted correctly and 2 were
predicted wrongly. With this the first epoch is complete. One round in which
every training row is put through the test above and the algorithm learns
from it is called 1 epoch. It is summarized as follows.

Epoch = 0

x1  | x2  | y | weights         | activation unit                     | predicted_y | updated weights for next row (if y != predicted_y)
0.4 | 0.3 | 1 | 0, 0, 0         | 0*1 + 0*0.4 + 0*0.3 = 0             | 0           | 0 + 0.1 * 1 = 0.10; 0 + 0.1 * 1 * 0.4 = 0.04; 0 + 0.1 * 1 * 0.3 = 0.03
0.6 | 0.8 | 1 | 0.1, 0.04, 0.03 | 0.1*1 + 0.04*0.6 + 0.03*0.8 = 0.148 | 1           |
0.7 | 0.5 | 1 | 0.1, 0.04, 0.03 | 0.1*1 + 0.04*0.7 + 0.03*0.5 = 0.143 | 1           |
0.9 | 0.2 | 0 | 0.1, 0.04, 0.03 | 0.1*1 + 0.04*0.9 + 0.03*0.2 = 0.142 | 1           | 0.1 + 0.1 * -1 = 0.0; 0.04 + 0.1 * -1 * 0.9 = -0.05; 0.03 + 0.1 * -1 * 0.2 = 0.01


The weights computed at the end of the first epoch are carried over and used
with the training data in the next epoch. In this way 6 epochs are computed,
as follows.


Epoch = 1

x1  | x2  | y | weights          | activation unit                       | predicted_y | updated weights for next row (if y != predicted_y)
0.4 | 0.3 | 1 | 0, -0.05, 0.01   | 0*1 + -0.05*0.4 + 0.01*0.3 = -0.017   | 0           | 0 + 0.1 * 1 = 0.10; -0.05 + 0.1 * 1 * 0.4 = -0.01; 0.01 + 0.1 * 1 * 0.3 = 0.04
0.6 | 0.8 | 1 | 0.1, -0.01, 0.04 | 0.1*1 + -0.01*0.6 + 0.04*0.8 = 0.126  | 1           |
0.7 | 0.5 | 1 | 0.1, -0.01, 0.04 | 0.1*1 + -0.01*0.7 + 0.04*0.5 = 0.113  | 1           |
0.9 | 0.2 | 0 | 0.1, -0.01, 0.04 | 0.1*1 + -0.01*0.9 + 0.04*0.2 = 0.099  | 1           | 0.1 + 0.1 * -1 = 0.0; -0.01 + 0.1 * -1 * 0.9 = -0.1; 0.04 + 0.1 * -1 * 0.2 = 0.02

Epoch = 2

x1  | x2  | y | weights          | activation unit                       | predicted_y | updated weights for next row (if y != predicted_y)
0.4 | 0.3 | 1 | 0, -0.1, 0.02    | 0*1 + -0.1*0.4 + 0.02*0.3 = -0.034    | 0           | 0 + 0.1 * 1 = 0.10; -0.1 + 0.1 * 1 * 0.4 = -0.06; 0.02 + 0.1 * 1 * 0.3 = 0.05
0.6 | 0.8 | 1 | 0.1, -0.06, 0.05 | 0.1*1 + -0.06*0.6 + 0.05*0.8 = 0.104  | 1           |
0.7 | 0.5 | 1 | 0.1, -0.06, 0.05 | 0.1*1 + -0.06*0.7 + 0.05*0.5 = 0.083  | 1           |
0.9 | 0.2 | 0 | 0.1, -0.06, 0.05 | 0.1*1 + -0.06*0.9 + 0.05*0.2 = 0.056  | 1           | 0.1 + 0.1 * -1 = 0.0; -0.06 + 0.1 * -1 * 0.9 = -0.15; 0.05 + 0.1 * -1 * 0.2 = 0.03

Epoch = 3

x1  | x2  | y | weights          | activation unit                       | predicted_y | updated weights for next row (if y != predicted_y)
0.4 | 0.3 | 1 | 0, -0.15, 0.03   | 0*1 + -0.15*0.4 + 0.03*0.3 = -0.051   | 0           | 0 + 0.1 * 1 = 0.10; -0.15 + 0.1 * 1 * 0.4 = -0.11; 0.03 + 0.1 * 1 * 0.3 = 0.06
0.6 | 0.8 | 1 | 0.1, -0.11, 0.06 | 0.1*1 + -0.11*0.6 + 0.06*0.8 = 0.082  | 1           |
0.7 | 0.5 | 1 | 0.1, -0.11, 0.06 | 0.1*1 + -0.11*0.7 + 0.06*0.5 = 0.053  | 1           |
0.9 | 0.2 | 0 | 0.1, -0.11, 0.06 | 0.1*1 + -0.11*0.9 + 0.06*0.2 = 0.013  | 1           | 0.1 + 0.1 * -1 = 0.0; -0.11 + 0.1 * -1 * 0.9 = -0.2; 0.06 + 0.1 * -1 * 0.2 = 0.04

Epoch = 4

x1  | x2  | y | weights          | activation unit                       | predicted_y | updated weights for next row (if y != predicted_y)
0.4 | 0.3 | 1 | 0, -0.2, 0.04    | 0*1 + -0.2*0.4 + 0.04*0.3 = -0.068    | 0           | 0 + 0.1 * 1 = 0.10; -0.2 + 0.1 * 1 * 0.4 = -0.16; 0.04 + 0.1 * 1 * 0.3 = 0.07
0.6 | 0.8 | 1 | 0.1, -0.16, 0.07 | 0.1*1 + -0.16*0.6 + 0.07*0.8 = 0.06   | 1           |
0.7 | 0.5 | 1 | 0.1, -0.16, 0.07 | 0.1*1 + -0.16*0.7 + 0.07*0.5 = 0.023  | 1           |
0.9 | 0.2 | 0 | 0.1, -0.16, 0.07 | 0.1*1 + -0.16*0.9 + 0.07*0.2 = -0.03  | 0           |

Epoch = 5

x1  | x2  | y | weights          | activation unit                       | predicted_y | updated weights for next row (if y != predicted_y)
0.4 | 0.3 | 1 | 0.1, -0.16, 0.07 | 0.1*1 + -0.16*0.4 + 0.07*0.3 = 0.057  | 1           |
0.6 | 0.8 | 1 | 0.1, -0.16, 0.07 | 0.1*1 + -0.16*0.6 + 0.07*0.8 = 0.06   | 1           |
0.7 | 0.5 | 1 | 0.1, -0.16, 0.07 | 0.1*1 + -0.16*0.7 + 0.07*0.5 = 0.023  | 1           |
0.9 | 0.2 | 0 | 0.1, -0.16, 0.07 | 0.1*1 + -0.16*0.9 + 0.07*0.2 = -0.03  | 0           |



In the 6th epoch, all of the training rows are predicted correctly. These
weights can therefore be taken as the algorithm's weights for predicting
other data.

In this way, if the prediction for a row is correct, the algorithm moves on
to the next row. Otherwise it recomputes the weights and then moves on to the
next row. Because this is repeated until every prediction comes out correct,
it is called an error-driven learning algorithm. The MLP (Multi-Layer
Perceptron), which is built on this, is what is used in neural networks.
24. Artificial Neural Networks


Learning modelled on the way a single neuron learns is the perceptron;
learning modelled on the way the human brain, with its many neurons, learns
is the Multi-layer perceptron. That is, neurons learn on the basis of
actions, and the human brain learns on the basis of what the neurons have
learnt. In the same way, perceptrons learn on the basis of the data, and the
MLP learns by arranging the perceptrons into a directed acyclic graph. This
is what is called an Artificial neural network.


We have already seen that the perceptron helps to classify data that can be
separated by a straight line. An MLP can be used to separate data arranged in
a non-linear way. This is why it is called a universal function approximator.
The idea of kernelization also helps to separate data arranged in a
non-linear way; we have already covered it in the section on SVM.


In the example below, 16 training rows are given. X holds 2 features, and y
gives the class each row should be placed under, for training. Since the data
is to be separated into three classes, this is a multi-class classification.


https://gist.github.com/nithyadurai87/b95e0ccd56464646da32ffdddb8b457f
from mlxtend.classifier import MultiLayerPerceptron as MLP
from mlxtend.plotting import plot_decision_regions
import matplotlib.pyplot as plt
import numpy as np

X = np.asarray([[6.1,1.4],[7.7,2.3],[6.3,2.4],[6.4,1.8],[6.2,1.8],[6.9,2.1],
[6.7,2.4],[6.9,2.3],[5.8,1.9],[6.8,2.3],[6.7,2.5],[6.7,2.3],[6.3,1.9],[6.5,2.1],
[6.2,2.3],[5.9,1.8]])

# Standardize each feature to zero mean and unit variance.
X = (X - X.mean(axis=0)) / X.std(axis=0)

y = np.asarray([0,2,2,1,2,2,2,2,2,2,2,2,2,2,2,2])

nn = MLP(hidden_layers=[50], l2=0.00, l1=0.0, epochs=150, eta=0.05,
momentum=0.1, decrease_const=0.0, minibatches=1, random_seed=1, print_progress=3)

nn = nn.fit(X, y)

fig = plot_decision_regions(X=X, y=y, clf=nn, legend=2)
plt.show()

print('Accuracy(epochs = 150): %.2f%%' % (100 * nn.score(X, y)))

# Train again with more epochs.
nn.epochs = 250
nn = nn.fit(X, y)

fig = plot_decision_regions(X=X, y=y, clf=nn, legend=2)
plt.title('epochs = 250')
plt.show()

print('Accuracy(epochs = 250): %.2f%%' % (100 * nn.score(X, y)))

# Cost per epoch with a single batch (gradient descent).
plt.plot(range(len(nn.cost_)), nn.cost_)
plt.title('Gradient Descent training (minibatches=1)')
plt.xlabel('Epochs')
plt.ylabel('Cost')
plt.show()

# Cost per epoch with one training example per batch (stochastic gradient descent).
nn.minibatches = len(y)
nn = nn.fit(X, y)

plt.plot(range(len(nn.cost_)), nn.cost_)
plt.title('Stochastic Gradient Descent (minibatches=no. of training examples)')
plt.ylabel('Cost')
plt.xlabel('Epochs')
plt.show()


When the MLP is trained with the training data, it divides the data as shown in the decision-region plot produced by the code above. In that plot, the sample that should fall under class 1 is not classified correctly; it ends up in the region of class 0.



So, change the number of epochs given when training the MLP from 150 to 250 and train it again. All the samples are then classified correctly.


nn.epochs = 250
nn = nn.fit(X, y)
fig = plot_decision_regions(X=X, y=y, clf=nn, legend=2)
plt.title('epochs = 250')
plt.show()

With 150 epochs the accuracy is 93.75%; with 250 epochs it rises to 100.00%.

Iteration: 150/150 | Cost 0.06 | Elapsed: 0:00:00 | ETA: 0:00:00
Accuracy(epochs = 150): 93.75%

Iteration: 250/250 | Cost 0.04 | Elapsed: 0:00:00 | ETA: 0:00:00
Accuracy(epochs = 250): 100.00%
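The 93.75% figure follows directly from the size of the data set: with 16 training samples, a single misclassified point leaves 15 of 16 correct. Here is a small sketch of that arithmetic, assuming (as the text describes) that the only error is the class 1 sample being predicted as class 0. The array y_pred_150 is a hypothetical stand-in for the model's predictions, not output copied from a run.

```python
import numpy as np

y = np.asarray([0, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

# Hypothetical predictions after 150 epochs: identical to y except
# that the single class 1 sample (index 3) is predicted as class 0.
y_pred_150 = y.copy()
y_pred_150[3] = 0

# Accuracy is the fraction of predictions that match the true labels.
accuracy = np.mean(y_pred_150 == y)
print('Accuracy: %.2f%%' % (100 * accuracy))
```

With so few samples, each misclassification moves the accuracy by 6.25 percentage points, so the jump from 93.75% to 100% corresponds to fixing exactly one point.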

Next, a graph is drawn showing how the cost value decreases at each epoch. When minibatches, one of the parameters given while training the MLP, is set to 1, the model is trained using gradient descent. This is the method we learned about in the perceptron section.

[Figure: Gradient Descent training (minibatches=1), cost versus epochs]



When minibatches is set to the number of training samples, each sample forms its own minibatch and the model is trained using stochastic gradient descent. That is, instead of computing the gradient over the whole training set before each weight update, the weights are updated after every individual sample. Because each update is based on a single sample, the cost recorded at each epoch fluctuates, which is why the resulting graph has a zig-zag shape. When the number of training samples is very large, gradient descent that examines all of them before every update takes a long time to reach the optimum, so the stochastic variant can be used in such cases. The minibatches=1 setting, in which the whole training set forms a single batch, is also called batch gradient descent.
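The difference between the two update schedules can be sketched without mlxtend. The example below is an illustration under stated assumptions, not the library's implementation: it fits a one-parameter model y = w * x with plain NumPy, once with a single update per epoch over the whole training set (batch) and once with one update per sample (stochastic). The synthetic data, the learning rate, and the function names are made up for the illustration; n_batches mirrors the minibatches parameter only in spirit.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=16)
y = 3.0 * x + rng.normal(0.0, 0.1, size=16)   # synthetic data, true slope 3

def cost(w):
    """Mean squared error of the model y = w * x on the full data set."""
    return np.mean((w * x - y) ** 2)

def train(n_batches, epochs=50, eta=0.5):
    """Gradient descent with the data split into n_batches per epoch.

    n_batches=1      -> one update per epoch (batch gradient descent)
    n_batches=len(x) -> one update per sample (stochastic gradient descent)
    """
    w = 0.0
    history = []
    for _ in range(epochs):
        order = rng.permutation(len(x))          # shuffle once per epoch
        for idx in np.array_split(order, n_batches):
            # Gradient of the squared error on this batch only
            grad = 2.0 * np.mean((w * x[idx] - y[idx]) * x[idx])
            w -= eta * grad
        history.append(cost(w))                  # full-data cost after the epoch
    return w, history

w_batch, cost_batch = train(n_batches=1)
w_sgd, cost_sgd = train(n_batches=len(x))
print('batch: w = %.3f' % w_batch)
print('sgd:   w = %.3f' % w_sgd)
```

Plotting cost_batch and cost_sgd against the epoch number should show a smooth curve for the batch run and a more jagged one for the stochastic run, since each stochastic update follows the gradient of one sample rather than of the whole set.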
25. Conclusion


Machine learning does not end here. Many more fields, such as Deep Learning, AI, and Neural Networks, are emerging in the world of computing. I urge you to keep learning them through the lessons, blogs, sites such as Stack Overflow, and videos available on the internet.


Thank you
26. YouTube Playlist


https://www.youtube.com/watch?v=iHG8We58HVY&list=PL5itdTO7Pm8wxRaPWljPntnBmnOs4ExDM


ML-01 Introduction to machine learning in Tamil
https://www.youtube.com/watch?v=iHG8We58HVY

ML-02 Introduction to Machine Learning Algorithms in Tamil
https://www.youtube.com/watch?v=AYMuTO5i4aE

ML-03 Pandas - Introduction to Pandas in Tamil
https://www.youtube.com/watch?v=1jrK84iZv7g

ML-04 Machine Learning Model Creation in Tamil
https://www.youtube.com/watch?v=Nz6iTOZli-k

ML-05 Machine Learning Model - Prediction - in Tamil
https://www.youtube.com/watch?v=05HMDKepzRc

ML-06 Feature Selection - Manipulated variable - Disturbance Variable
https://www.youtube.com/watch?v=H85tTHHFMw

ML-07 Feature Selection - Process Variable - RFE Technique - In Tamil
https://www.youtube.com/watch?v=Dvq1K24vlso

ML 08 - Machine Learning in Tamil - 08 - Improving Model Score
https://www.youtube.com/watch?v=6clvCfFI6qI

ML 09 - Machine Learning in Tamil - Outliers Removal
https://www.youtube.com/watch?v=SfBNvnpsovO

ML 10 - Machine Learning in Tamil - Explanatory data Analysis
https://www.youtube.com/watch?v=SliSuYT-xiU

ML 11 - Machine Learning in Tamil - Simple Linear Regression
https://www.youtube.com/watch?v=OB36E9wlPI

ML 12 - Machine Learning in Tamil - Gradient Descent
https://www.youtube.com/watch?v=_3Cfw2gmOhl

ML 13 - Machine Learning in Tamil - Multiple Linear Regression
https://www.youtube.com/watch?v=ECK4bjIrWjw

ML 14 - Machine Learning in Tamil - Polynomial Regression
https://www.youtube.com/watch?v=8dTML0Xvzro

ML 15 - Machine Learning in Tamil - Feature extraction using vectors
https://www.youtube.com/watch?v=-Xktzn9XxGa

ML 16 - Machine Learning in Tamil - Natual Language ToolKit
https://www.youtube.com/watch?v=yZLG5hOIvPM

ML 17 - Machine Learning in Tamil - Logistic Regression
https://www.youtube.com/watch?v=dXEnjS7Xjqs

ML 18 - Machine Learning in Tamil - Multi class classification
https://www.youtube.com/watch?v=R_lXGh01EoA

ML 19 - Machine Learning in Tamil - Neural Networks
https://www.youtube.com/watch?v=8pOBrF7bfqs
27. Other E-books


http://freetamilebooks.com/authors/nithyaduraisamy/
28. About Kaniyam


Goals

• To become a platform that continuously provides the information needed by anyone who wishes to learn about free and open-source computing, from its simple basics to its most advanced aspects.

• To provide information in multimedia forms: text, audio, and video.

• To report on events in this field.

• To provide information under terms open to all, so that anyone can contribute.

• To publish the information in print, as books, and on discs.

Contributing


• Anyone who is interested may contribute.

• The material must relate to free and open-source computing.

• Before you start contributing, you are expected to assign your copyright to Kaniyam.

• Anyone may start contributing after sending a mail with the following details, as a declaration, to editor@kaniyam.com.

• Subject of the mail: Copyright assignment

• Content of the mail:

• I declare that all works I send to Kaniyam were created for Kaniyam in the first instance.

• To that end, I grant Kaniyam the copyright I may hold in them.

• Your full name

• If someone else is already contributing to an area you wish to work on, try to work together with that person.

• Articles may be translations, or may be written after learning the subject from someone who knows it.

• Works may also be published as a series.

• Works may be of many kinds, such as technical writing, explanations of principles, advocacy, stories, cartoons, or satire, as long as they are relevant to this field.

• You may write in whatever style comes naturally to you.

• Send your work as a simple text document to editor@kaniyam.com.

• You may also contribute in other ways, such as site maintenance and support.

• If you have any questions, write to editor@kaniyam.com.


Requests


• This is an effort undertaken for people who wish to learn about computer technology.

• You do not need to be a highly skilled expert to contribute.

• It is enough to have an interest in explaining what you know in as simple a way as you can.

• Its growth is in the hands of each one of us.

• If you find any shortcomings, please report them properly and help it improve.


Publication details


Copyright © 2019 Kaniyam.

Articles published in Kaniyam are provided under the Creative Commons terms given at http://creativecommons.org/licenses/by-sa/3.0/.


Accordingly, permission is granted to copy, distribute, publicize, adapt, and commercially use the articles that appear in Kaniyam, provided that credit is given to Kaniyam and to the author who wrote them.


Editor: Shrinivasan - editor@kaniyam.com +91 98417 95468


The opinions expressed in the articles belong to their respective authors.
29. Kaniyam Foundation


29.1 Vision


An environment in which virtual resources, tools, and bodies of knowledge related to the Tamil language and its communities are freely accessible to everyone.


29.2 Mission


To ensure that the use of the Tamil language grows in step with scientific and socio-economic development, and to make all bodies of knowledge and resources freely accessible to everyone.
29.3 Current activities


• Kaniyam e-magazine - kaniyam.com

• Free Tamil e-books under Creative Commons licenses - FreeTamilEbooks.com

29.4 Free and open-source software

• Text to Speech converter

• Optical Character Recognition

• OCR for Wikisource

• Sending e-books to Kindle devices - Send2Kindle

• Small tools for Wikipedia

• E-book creation tool

• Text to Speech - web application

• Sangam literature - Android app

• FreeTamilEbooks - Android app

• FreeTamilEbooks - iOS app

• WikisourceEbooksReport - download list of Wikisource e-books for Indian languages

• FreeTamilEbooks.com - Download counter, download list of e-books


29.5 Upcoming projects and software


• Rapidly proofreading the e-books on Wikisource with part-time or full-time staff

• Hiring a full-time programmer to build a variety of free and open-source software

• Conducting training workshops on Tamil NLP

• Forming a Kaniyam readers' circle

• Identifying and encouraging people who create free and open-source software and resources under Creative Commons licenses

• Developing more contributors to the Kaniyam magazine and training them

• A web application for creating e-books

• A web application for OCR

• Creating and publishing Tamil podcasts

• Translating place, street, and town names on OpenStreetMap.org into Tamil

• Mapping all of Tamil Nadu on OpenStreetMap.org

• Providing children's stories in audio form

• Cleaning up Ta.wiktionary.org and making it suitable for API use

• Building an app for recording audio for Ta.wiktionary.org

• Building a Tamil spell checker

• Building a Tamil root-word finder

• Uploading all FreeTamilEbooks.com e-books to Google Play Books and GoodReads.com

• Building a web application for learning Tamil typing

• Building a web application for learning to read and write Tamil (like aamozish.com/Course_preface)


The support of all of you is needed to build and carry out the projects and software listed above. If you can contribute in any way, please e-mail your details to kaniyamfoundation@gmail.com.
29.6 Transparency


All activities, projects, and software of the Kaniyam Foundation are open to everyone and 100% transparent. The activities, together with the monthly reports and income and expenditure details, can be viewed online.

All software created by the Kaniyam Foundation is released as free and open-source software, with its source code, under one of the GNU GPL, Apache, BSD, MIT, or Mozilla licenses. All other resources created, such as photographs, audio files, videos, e-books, and articles, are released under Creative Commons licenses so that anyone can share and use them.

29.7 Donations


Your donations will encourage us to carry out, quickly and well, the work of creating free resources for Tamil.


Please send your donation to the following bank account and then e-mail the details to kaniyamfoundation@gmail.com.
Kaniyam Foundation

Account Number: 606 1010 100 502 79 


Union Bank Of India 
West Tambaram, Chennai 

IFSC - UBIN0560618 


Account Type : Current Account 


29.8 QR Code for UPI apps


Note: The QR code may not work in some UPI apps. In that case, please use the account number and IFSC code given above for internet banking.
[QR code image: BHIM UPI payments accepted at Kaniyam Foundation. Account Number: 606101010050279, IFSC Code: UBIN0560618. Scan and pay using any UPI-supported app.]